I am working on a text analysis project, and for every word in a text I want to find the words that appear next to it and how many times each combination occurs.
I already have word counting working with a hashmap, but I can't think of a good way to approach this part.
I was thinking a two-dimensional array might be the way to go, but I'm not sure that's the right structure.
What I want is to go through every word in a text and store it together with the following information: the word before it, the word after it, and how many times each of those combinations appears.
For example, take the text:
"July was eating apples she bought from a person who sold apples, who got them from the person who plucked the apples."
This text would ideally produce a structure similar to this:
allWords {
    apples {
        before { eating: 1, sold: 1, the: 1 },
        after { she: 1, who: 1 }
    }
    who {
        before { person: 2, apples: 1 },
        after { sold: 1, got: 1, plucked: 1 }
    }
    etc.
}
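
To make the structure above concrete, here is a minimal sketch of one way it could be built with nested maps instead of a two-dimensional array. It is written in Python only as an assumption, since no language is named here; the function name `neighbor_counts`, the regex-based tokenizer, and the choice to lowercase the text and ignore punctuation are all hypothetical choices for illustration.

```python
import re
from collections import Counter, defaultdict


def neighbor_counts(text):
    """Map each word to counts of the words directly before and after it."""
    # Assumed tokenization: lowercase, keep runs of letters and apostrophes,
    # so punctuation such as the comma in "apples, who" is ignored.
    words = re.findall(r"[a-z']+", text.lower())

    # word -> {"before": Counter, "after": Counter}
    all_words = defaultdict(lambda: {"before": Counter(), "after": Counter()})

    for i, word in enumerate(words):
        entry = all_words[word]
        if i > 0:  # the first word has no predecessor
            entry["before"][words[i - 1]] += 1
        if i + 1 < len(words):  # the last word has no successor
            entry["after"][words[i + 1]] += 1
    return all_words


text = ("July was eating apples she bought from a person who sold apples, "
        "who got them from the person who plucked the apples.")
all_words = neighbor_counts(text)

print(dict(all_words["apples"]["before"]))
print(dict(all_words["who"]["after"]))
```

Each word costs one map lookup per neighbor, so the whole pass is linear in the length of the text. The same shape should translate to a hashmap of hashmaps in other languages.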
If this is possible, the next step would be to make the range configurable, so that I can get the two or five words before and after a word, or only the word exactly two places before or after it.
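
The range idea could be sketched by keying the counts on a signed offset, where -2 means "two places before" and +1 means "directly after". This is an illustration under assumptions, not a definitive implementation: Python, the function name `offset_counts`, the `max_distance` parameter, and the regex tokenizer are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def offset_counts(text, max_distance=2):
    """Map word -> signed offset -> Counter of words found at that offset."""
    # Assumed tokenization: lowercase, punctuation ignored.
    words = re.findall(r"[a-z']+", text.lower())

    counts = defaultdict(lambda: defaultdict(Counter))
    for i, word in enumerate(words):
        for offset in range(-max_distance, max_distance + 1):
            j = i + offset
            # Skip the word itself and positions outside the text.
            if offset != 0 and 0 <= j < len(words):
                counts[word][offset][words[j]] += 1
    return counts


text = ("July was eating apples she bought from a person who sold apples, "
        "who got them from the person who plucked the apples.")
counts = offset_counts(text, max_distance=2)

# The word exactly two places before "apples".
two_before = counts["apples"][-2]

# All words within two places before "apples": merge offsets -1 and -2.
within_two_before = counts["apples"][-1] + counts["apples"][-2]

print(dict(two_before))
print(dict(within_two_before))
```

Storing one counter per offset keeps both queries cheap: a single offset is a direct lookup, and a range is the sum of a few counters.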