Currently I am working on a project where I need to extract text from an HTML file. I tried to use the proHTML library but could not get my head around how to make it parse the HTML file I have and return the text. Inside the file I have markup that looks like this:
<abbr class="time published" title="2011-07-31T22:35:54+0000">July 31, 2011 at 15:35</span>
<div class="msgbody">
oh good </div>
</div>
<div class="message reply">
<span class="profile fn">
Cott Wood</span>
<abbr class="time published" title="2011-07-31T22:35:57+0000">July 31, 2011 at 15:35</span>
<div class="msgbody">
yeas i am ok </div>
</div>
<div class="message reply">
<span class="profile fn">
Cott Wood</span>
<abbr class="time published" title="2011-07-31T22:36:02+0000">July 31, 2011 at 15:36</span>
<div class="msgbody">
what you done for it?
</div>
</div>
<div class="message reply">
<span class="profile fn">
John Johnson</span>
<abbr class="time published" title="2011-08-01T12:13:14+0000">August 1, 2011 at 5:13</span>
<div class="msgbody">
oh is it how come?
</div>
</div>
So basically I need to extract the name, which is in bold italic at the top, then compare it to the names throughout the messages to find the messages I need (the ones in bold) and extract only those that match the name. I would appreciate any help or guidance on how I could achieve this. :) Thank you.
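
To make the goal concrete, here is a minimal sketch of the matching step described above. It assumes Python and its standard-library `html.parser` rather than proHTML, and it assumes the target name is already known and passed in as a string (pulling it from the top of the file is not shown, since that part of the markup is not in the sample). The names `MessageParser`, `messages_by`, and `SAMPLE` are made up for illustration. The parser keys on the `profile fn` and `msgbody` classes from the sample and is written to tolerate the `<abbr>` elements that are closed with `</span>`.

```python
from html.parser import HTMLParser


class MessageParser(HTMLParser):
    """Collect (author, text) pairs from the message blocks."""

    def __init__(self):
        super().__init__()
        self.messages = []   # list of (author, text) tuples
        self._mode = None    # "author", "body", or None
        self._author = ""
        self._body = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "profile" in classes and "fn" in classes:
            # Start of an author name.
            self._mode = "author"
            self._author = ""
        elif tag == "div" and "msgbody" in classes:
            # Start of a message body.
            self._mode = "body"
            self._body = []
        elif tag == "abbr":
            # Timestamp text is ignored.
            self._mode = None

    def handle_endtag(self, tag):
        if tag == "span" and self._mode == "author":
            self._mode = None
        elif tag == "div" and self._mode == "body":
            # End of a message body: store it with the last author seen.
            text = "".join(self._body).strip()
            self.messages.append((self._author.strip(), text))
            self._mode = None

    def handle_data(self, data):
        if self._mode == "author":
            self._author += data
        elif self._mode == "body":
            self._body.append(data)


def messages_by(html, name):
    """Return the message texts whose author equals `name` exactly."""
    parser = MessageParser()
    parser.feed(html)
    parser.close()
    return [text for author, text in parser.messages if author == name]


# Two messages copied from the sample above.
SAMPLE = """
<div class="message reply">
<span class="profile fn">
Cott Wood</span>
<abbr class="time published" title="2011-07-31T22:35:57+0000">July 31, 2011 at 15:35</span>
<div class="msgbody">
yeas i am ok </div>
</div>
<div class="message reply">
<span class="profile fn">
John Johnson</span>
<abbr class="time published" title="2011-08-01T12:13:14+0000">August 1, 2011 at 5:13</span>
<div class="msgbody">
oh is it how come?
</div>
</div>
"""

print(messages_by(SAMPLE, "Cott Wood"))  # ['yeas i am ok']
```

The comparison here is an exact string match after trimming whitespace. If the names in the real file differ in case or spacing, the comparison in `messages_by` would need to normalise both sides first.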