Counting Words in a Text

Hello everyone, I'm new to Processing and I'm struggling to understand some concepts. Arrays are one of them: I understand the idea and what they are used for, but I don't really get how to work with them...

Here is what I would like to do: I have a txt file that I want Processing to read and tell me how many times each word occurs. I would like that analysis to be saved as a txt or csv file.

I found a similar example by Daniel Shiffman and modified it slightly to work with my text. It displays the results on screen, but I don't know how to save them to a file. Another problem is that once it has counted every word in my file it starts over from the beginning instead of stopping.

Here's Daniel's code with my modifications:

        // Learning Processing
        // Daniel Shiffman
        // http://www.learningprocessing.com

        // Example 18-6: Analyzing King Lear

        PFont f;              // A variable to hold onto a font
        String[] kinglear;    // The array to hold all of the text
        int counter = 0;      // Index of the word currently being counted
        int y = 10;

        // We will use spaces and punctuation as delimiters
        String delimiters = " ,.?!;:[]";

        void setup() {
          size(200,800);

          // Load the text file into an array of strings, one per line
          String url = "data/text.txt";
          String[] rawtext = loadStrings(url);

          // Join the lines into one long string, separated by spaces so that
          // the last word of one line does not fuse with the first word of the next
          String everything = join(rawtext, " ");

          // All the lines in King Lear are first joined as one big String and then split up into an array of individual words. 
          // Note the use of splitTokens() since we are using spaces and punctuation marks all as delimiters.  
          kinglear = splitTokens(everything,delimiters);
          frameRate(5);
        }

        void draw() {



          // Pick one word from King Lear
          String theword = kinglear[counter];

          // Count how many times that word appears in King Lear
          int total = 0;
          for (int i = 0; i < kinglear.length; i ++ ) {
            if (theword.equals(kinglear[i])) {
              total ++;
            }
          }

          // Display the text and total times the word appears

          fill(0);
          text(theword + " : " + total,10,y);
          y = y + 40;
          stroke(0);
          fill(175);
          rect(10,50,total/4,20);

          // Move onto the next word
          counter = (counter + 1) % kinglear.length;
        }
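
One detail worth checking in the join-then-split step is the separator passed to join(). The following is a minimal sketch in plain Java (not a Processing sketch), assuming that java.util.StringTokenizer is a fair stand-in for splitTokens() for this purpose; the class name, the helper tokens() and the two input lines are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class JoinThenSplit {
    // Splits text on any of the delimiter characters, skipping empty tokens.
    static List<String> tokens(String text, String delimiters) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] lines = { "to be or", "not to be" };  // hypothetical input lines
        String delimiters = " ,.?!;:[]";

        // Empty separator: "or" and "not" fuse into one token.
        System.out.println(tokens(String.join("", lines), delimiters));
        // prints [to, be, ornot, to, be]

        // Space separator: every word survives as its own token.
        System.out.println(tokens(String.join(" ", lines), delimiters));
        // prints [to, be, or, not, to, be]
    }
}
```

With an empty separator the word counts at line boundaries come out wrong, because the fused token is counted as a word of its own.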

Answers

  • edited June 2015

    Look at saveStrings() and noLoop().

  • Thanks, let me try that!

  • edited June 2015

    Your architecture

    At the moment you use draw() itself to increase counter and step from word to word.

    If your goal in this sketch is just to make the file (which you then read with another sketch):

    • you might as well put a second (outer) for loop around the lines you have in draw() now

    Storing

    Then store the string you currently pass to text(theword + " : " + total,10,y); in a String array:

    listOfWords[counter] = theword + ":" + total;
    

    After the two nested for loops, use saveStrings() to write listOfWords to the hard drive, and then call

        noLoop();
    } // end of draw
    

    ;-)
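
The two nested loops described above could look roughly like this. It is a minimal sketch in plain Java rather than a Processing sketch, so it runs on its own; the class name WordTotals, the method countAll() and the sample words are hypothetical, while listOfWords, theword, counter and total follow the names used in the thread.

```java
public class WordTotals {
    // Returns one "word:total" entry for every position in words.
    static String[] countAll(String[] words) {
        String[] listOfWords = new String[words.length];

        // Outer loop: replaces the stepping that draw() did via counter.
        for (int counter = 0; counter < words.length; counter++) {
            String theword = words[counter];

            // Inner loop: count how often theword appears in the whole text.
            int total = 0;
            for (int i = 0; i < words.length; i++) {
                if (theword.equals(words[i])) {
                    total++;
                }
            }
            listOfWords[counter] = theword + ":" + total;
        }
        return listOfWords;
    }

    public static void main(String[] args) {
        String[] words = { "the", "king", "the", "fool" };  // sample data
        for (String line : countAll(words)) {
            System.out.println(line);
        }
        // prints the:2, king:1, the:2, fool:1 (one per line)
    }
}
```

In the Processing sketch, the resulting array would then go to saveStrings("totals.txt", listOfWords), followed by noLoop(). Note that this approach writes one line per word position, so a word that occurs twice also appears twice in the output.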

  • Thanks Chrisir. When I add

    listOfWords[counter] = theword + ":" + total;

    I get a NullPointerException

    I'm defining the array as follows:

    String[] listOfWords;

  • At this point, this is how my draw looks:

            void draw() {
    
    
    
              // Pick one word from King Lear
              String theword = kinglear[counter];
    
              // Count how many times that word appears in King Lear
              int total = 0;
              for (int i = 0; i < kinglear.length; i ++ ) {
                if (theword.equals(kinglear[i])) {
                  total ++;
                }
              }
    
              // Display the text and total times the word appears
    
              fill(0);
    
              text(theword + " : " + total,10,y);
              listOfWords[counter] = str(theword + ":" + total);
              saveStrings("totals.txt", listOfWords);
              y = y + 40;
              stroke(0);
              fill(175);
              rect(10,50,total/4,20);
    
              // Move onto the next word
              counter = (counter + 1) % kinglear.length;
    
    
            }
    

    The str() call is not working here. I thought I needed to convert the contents to a String in order to store them in listOfWords.

  • edited June 2015

    That's not good: you haven't done what I wrote.

    You must use two for loops and call saveStrings() after them, not inside draw() on every frame.

    Your array

    String[] listOfWords;
    

    and at the end of setup():

    listOfWords = new String [kinglear.length];
    

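    The string (str) command

    You wrote that the str() command is not working here. str() converts a primitive value such as an int to a String, so apply it to total only, not to the whole expression: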

    listOfWords[counter] = theword + ":" + str(total);
    

    ;-)
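
The NullPointerException and the str() question can both be reproduced outside Processing. Below is a minimal sketch in plain Java, assuming that String.valueOf() is an acceptable stand-in for Processing's str(total); the class name ArrayAllocation and the helper entry() are hypothetical.

```java
public class ArrayAllocation {
    // Declared only: no array exists yet, so the reference is null.
    static String[] listOfWords;

    // Builds one "word:total" entry. Concatenation would convert the int
    // by itself; String.valueOf() just makes the conversion explicit.
    static String entry(String theword, int total) {
        return theword + ":" + String.valueOf(total);
    }

    public static void main(String[] args) {
        System.out.println(listOfWords == null);  // prints true

        // Writing to listOfWords[0] here would throw a NullPointerException.
        // Allocating the array first gives it a fixed number of slots.
        listOfWords = new String[3];
        listOfWords[0] = entry("king", 4);
        System.out.println(listOfWords[0]);       // prints king:4
    }
}
```

The same reasoning applies to the sketch: the array has to be created with new String[kinglear.length] after kinglear has been filled, which is why the allocation belongs at the end of setup().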
