Hello,
I am trying to do the following:
Load in a single sentence (unknown number of words, but not many). (This part doesn't need solving; I already have sentences coming in.)
Split it into individual words and save the words.
Load the next string.
Split the string into words and check whether any of the words are already saved (if so, add 1 to the quantity of that specific word). If a word isn't already saved, save it.
Repeat with x number of sentences.
Essentially the idea is to build up an arrayList(?) of words, and then display the top 4 most frequently used words in the sketch window.
Note: I've set up a bit of code that removes all common words such as "the", "and", "to", etc.
Here is the relevant piece of code I have been experimenting with (only addressing part of the issue at the moment):
String[] splitWords = split(sentence, " "); // split sentence into individual words
int numberOfWords2 = splitWords.length; // count number of words in sentence
for (int p = 0; p < splitWords.length; p++) { // add words to array list (p < length; p <= length would read past the end of the array)
  myWords.append(splitWords[p]);
}
sentenceCount = sentenceCount + 1; // count number of sentences
println(sentenceCount);
numberOfWords = numberOfWords + numberOfWords2; // total number of words
The main issue I am having is adding the words to the arrayList AFTER the existing words, rather than replacing them.
I am also unsure of the best way to keep a count of how many times each word has appeared across all sentences.
Thanks!
Answers
I think you want this
https://www.processing.org/reference/HashMap.html
It has been done many times; try googling it here in the forum.
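For reference, a minimal plain-Java sketch of the HashMap idea, assuming words are separated by whitespace. The class name WordTally and the helper addSentence() are made up for illustration; getOrDefault() is a standard java.util.Map method since Java 8. Because the same map is passed in for every sentence, new words are added alongside the existing ones rather than replacing them.

```java
import java.util.HashMap;
import java.util.Map;

class WordTally {
  // Hypothetical helper: split one sentence on whitespace and add its words to the running counts.
  static void addSentence(Map<String, Integer> counts, String sentence) {
    for (String word : sentence.trim().split("\\s+")) {
      if (word.isEmpty()) continue; // a blank sentence splits into one empty token
      // getOrDefault() returns 0 for a word that has not been seen yet
      counts.put(word, counts.getOrDefault(word, 0) + 1);
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new HashMap<String, Integer>();
    addSentence(counts, "red fish blue fish");
    addSentence(counts, "one fish");
    System.out.println(counts.get("fish")); // 3
  }
}
```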
Though not as performant as a HashMap<String, Integer>, the container class IntDict with its increment() method is the easiest way to pull that off.
@Chrisir, why do you import java.util.Map; if you end up not using it? :-/
import java.util.Map; is needed in the broader scheme of things, e.g. in my 2nd sketch ;-)
I'd rather import java.util.Map.Entry; and use: for (Entry<String, Integer> me : hm.entrySet()) {}
As for import java.util.Map; I'd use it for: final Map<String, Integer> hm = new HashMap<String, Integer>(); ;)
A more simplified addWord() for @Chrisir's HashMap<String, Integer> version:
Invoke it like this:
for (String word : testList) addWord(hm, word);
Wouldn't it crash with an NPE if the word was new?
I'm checking for null inside put(): count == null? 1 : count + 1
And method get() returns null in case "key" doesn't exist or the value was already null: http://docs.Oracle.com/javase/8/docs/api/java/util/Map.html#get-java.lang.Object-
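A rough guess at what such an addWord() could look like, pieced together from the put() expression and the invocation quoted in this thread. The class name AddWordSketch, the parameter names and the contents of testList are assumptions, not the original code.

```java
import java.util.HashMap;
import java.util.Map;

class AddWordSketch {
  // Assumed shape of addWord(): get() returns null for a new word, so start at 1 in that case.
  static void addWord(Map<String, Integer> hm, String word) {
    Integer count = hm.get(word); // an Integer can hold null, an int cannot
    hm.put(word, count == null ? 1 : count + 1);
  }

  public static void main(String[] args) {
    Map<String, Integer> hm = new HashMap<String, Integer>();
    String[] testList = { "apple", "pear", "apple" }; // made-up test data
    for (String word : testList) addWord(hm, word);
    System.out.println(hm.get("apple")); // 2
    System.out.println(hm.get("plum"));  // null
  }
}
```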
Ah, is this because you used Integer instead of int? Because this gives an NPE (null pointer exception) if wordLocal is new, I think:
Primitive datatypes can't have null assigned to them. And HashMap<String, Integer> has its values already as Integer objects after all.
So it was only natural to have some Integer variable to receive from get().
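To make the Integer versus int point concrete, here is a small sketch, assuming a plain HashMap<String, Integer>. The class and method names are invented for illustration. Assigning the result of get() straight to an int forces auto-unboxing, and unboxing null throws the NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;

class UnboxingSketch {
  // Returns true if reading the count into a primitive int throws an NPE.
  static boolean unboxingThrows(Map<String, Integer> hm, String wordLocal) {
    try {
      int count = hm.get(wordLocal); // get() returns null for a new word; unboxing null throws
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> hm = new HashMap<String, Integer>();
    hm.put("known", 1);
    System.out.println(unboxingThrows(hm, "known"));   // false
    System.out.println(unboxingThrows(hm, "unknown")); // true
  }
}
```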
thanks, mate!
Great thanks, the next step I am having trouble with is:
Instead of the test array, I want to use the words from a tweet. So when the tweet is received, I need to split it into individual words and then add each word to the array. This could potentially become a pretty large array.
The main difficulties I am having are:
Instead of the test string array I want to add each word from a tweet, and then keep adding words of future tweets. (Should I use an array, ArrayList or HashMap for this, since the size is unknown?)
As tweets come in I want to update 4 variables:
variable 1 = first most common word across all tweets
variable 2 = second most common word across all tweets
variable 3 = third most common word across all tweets
variable 4 = fourth most common word across all tweets
...then print the 4 most common words after each tweet is loaded. At the moment tweets are coming in as a string named 'tweet'.
Thanks again.
You could add the incoming words to an ArrayList of String, e.g.
and also add them to the hashmap (see above)
for 4 highest ranking
after the line 48 insert (pseudo code)
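One possible way to get the 4 highest ranking words, sketched in plain Java: copy the map entries into a list, sort by count in descending order, and take the first four keys. The names TopWords and topWords(), the sample tweets and the alphabetical tie-break are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

class TopWords {
  // Return up to n words, most frequent first; equal counts are ordered alphabetically.
  static List<String> topWords(Map<String, Integer> counts, int n) {
    List<Entry<String, Integer>> entries =
        new ArrayList<Entry<String, Integer>>(counts.entrySet());
    Collections.sort(entries, new Comparator<Entry<String, Integer>>() {
      public int compare(Entry<String, Integer> a, Entry<String, Integer> b) {
        int byCount = b.getValue().compareTo(a.getValue()); // higher count first
        return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
      }
    });
    List<String> top = new ArrayList<String>();
    for (int i = 0; i < Math.min(n, entries.size()); i++) {
      top.add(entries.get(i).getKey());
    }
    return top;
  }

  public static void main(String[] args) {
    Map<String, Integer> hm = new HashMap<String, Integer>();
    String[] tweets = { "cats chase mice", "mice fear cats", "cats nap", "dogs chase cats" };
    for (String tweet : tweets) {
      for (String word : tweet.split(" ")) {
        Integer count = hm.get(word);
        hm.put(word, count == null ? 1 : count + 1);
      }
      System.out.println(topWords(hm, 4)); // ranking after each tweet
    }
  }
}
```

Re-sorting after every tweet is simple and should be fine for a modest number of distinct words; the four variables can then be read from positions 0 to 3 of the returned list, after checking its size.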
Perhaps after adding to the HashMap and counting the words, you don't need to keep them in the "receiver" ArrayList. If so, you may look for a FIFO structure. Never used one, but something like Deque, I guess. @GoToLoop will know, I'm sure ;)
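A small sketch of the FIFO idea, assuming java.util.ArrayDeque as the queue: tweets are appended at the tail as they arrive and removed from the head once their words are counted, so nothing piles up. The class name TweetQueueSketch and the method drain() are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class TweetQueueSketch {
  // Take tweets out oldest first, count their words, and leave the queue empty.
  static void drain(Deque<String> pending, Map<String, Integer> counts) {
    String tweet;
    while ((tweet = pending.pollFirst()) != null) { // pollFirst() returns null once the queue is empty
      for (String word : tweet.trim().split("\\s+")) {
        if (word.isEmpty()) continue;
        Integer count = counts.get(word);
        counts.put(word, count == null ? 1 : count + 1);
      }
    }
  }

  public static void main(String[] args) {
    Deque<String> pending = new ArrayDeque<String>();
    Map<String, Integer> counts = new HashMap<String, Integer>();
    pending.addLast("first tweet");  // newest tweets go to the tail
    pending.addLast("second tweet");
    drain(pending, counts);
    System.out.println(counts.get("tweet") + " " + pending.isEmpty()); // 2 true
  }
}
```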
One final question (hopefully!):
I am trying to get the following piece of code to work:
I must be doing something wrong as it doesn't like the contents of the for loop.
Thanks again!