Hi,
I need some help with my programming. I am planning to build a text visualization for crime news. I would like to do something like this:
http://www.neoformix.com/2007/TextScope1.png
The difference is that my text comes from online news, like this:
NORITTA MURDER APPEAL: FEDERAL COURT'S DECISION OUT TOMORROW
Date: 27-03-2008
Author: / SSA AFY JR
COURT-NORITTA
PUTRAJAYA, March 27 (Bernama) -- The Federal Court will deliver its
judgment tomorrow in an appeal by the prosecution against the acquittal of Shah
Alam City Council engineer Hanif Basree Abdul Rahman of murdering business
development executive Noritta Samsuddin.
Hanif, 39, was accused of killing Noritta, 22, in her condominium, D-7-1,
Kondominium Puncak Prima Galleria, Jalan 17, Sri Hartamas, Kuala Lumpur,
between 1.30am and 4am on Dec 5, 2003.
The trial which received wide publicity, lasted 29 days and saw the
prosecution calling 34 witnesses.
On July 2004, Kuala Lumpur High Court Judge Datuk Abdull Hamid Embung
acquitted Haniff without calling for his defence after ruling, among others,
that the prosecution had failed to prove he was the last person to have sex
with Noritta because there were two mystery men involved.
The Court of Appeal on Jan 29, 2005 upheld Hamid's decision and commented
that the prosecution had presented insufficient evidence to show that Noritta's
death was caused by Hanif's act.
The Federal Court panel comprising Datuk Arifin Zakaria, Datuk Nik Hashim
Ab. Rahman, Datuk Hashim Yusoff, Datuk Zulkefli Ahmad Makinudin and
Tan Sri Zaki Tun Azmi, deferred their decisions on Nov 5, 2007, after
hearing submissions from deputy public prosecutor Wong Chiang Kiat and Hanif's
counsel Datuk V. Sittambaram.
-- BERNAMA
SSA AFY JR
I'm still a beginner with Processing. I have no idea how to read in the text from many .txt files or how to highlight the main points of the news. The mind mapping is also difficult for me. I don't know where to start.
I would appreciate it if you could help me with my code and give me some ideas. Maybe you can advise me where I can find programming examples for visualizing text.
Regards,
Hana
Comments
There is good information in the book Generative Design by Bohnacker, Gross, Laub in the chapter on type. You might also want to take a look at Visualizing Data by Ben Fry. The latter is especially good on the under-the-hood tasks.
Hello,
Do you have the many .txt files on your hard drive? Then you can use loadStrings(), which can be called in a for loop, especially when the files are numbered like crime1.txt, crime2.txt and so on.
Or are they on the web?
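If the files are on your disk and numbered, the loop could look roughly like the sketch below. It is plain Java so it can be tried outside Processing; inside a sketch, loadStrings("crime" + i + ".txt") would do the reading from the sketch's data folder and give you the lines as a String[]. The class name, the "crime" prefix and the folder are only assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class NewsLoader {
  // Builds names like crime1.txt, crime2.txt, ... (the prefix is a made-up example).
  static List<String> fileNames(String prefix, int count) {
    List<String> names = new ArrayList<>();
    for (int i = 1; i <= count; i++) {
      names.add(prefix + i + ".txt");
    }
    return names;
  }

  // Reads every numbered file that exists and returns one String per article.
  static List<String> loadAll(Path folder, String prefix, int count) {
    List<String> articles = new ArrayList<>();
    for (String name : fileNames(prefix, count)) {
      Path file = folder.resolve(name);
      if (!Files.isRegularFile(file)) {
        continue; // skip gaps in the numbering instead of failing
      }
      try {
        articles.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
      } catch (IOException e) {
        System.err.println("Could not read " + file + ": " + e.getMessage());
      }
    }
    return articles;
  }
}
```

Usage would be something like NewsLoader.loadAll(Paths.get("data"), "crime", 20), where "data" and 20 are placeholders for your own folder and number of files.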
Anyway, with text like this an automatic analysis could be hard. E.g. the same person is called "engineer Hanif Basree Abdul Rahman" in one place and just "Hanif" in another (as in "Hanif's act"), although Hanif alone may be fine as a word. Or Kuala Lumpur: two words but one city, so the program should treat the two words as one. The same goes for Federal Court.
Apart from that, you can count all the words. Then filter out words such as "the", "a" or "as" (stop words), because they carry no real content. See http://www.ranks.nl/resources/stopwords.html
Then you could take the 5 most frequent words and display them in a big circle. Use a HashMap for counting.
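A minimal sketch of the counting step, assuming plain Java that you can paste into a Processing tab. The stop list here is just a tiny made-up sample (a real one would come from a list like the one linked above), and the class and method names are my own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class WordCounter {
  // Tiny sample stop list, for illustration only.
  static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
      "the", "a", "an", "as", "of", "in", "on", "and", "to", "by", "was", "its", "that"));

  // Counts every word, ignoring case, stop words and single letters.
  static Map<String, Integer> count(String text) {
    Map<String, Integer> counts = new HashMap<>();
    // Split on anything that is not a letter, so "Hanif's" becomes "hanif" and "s".
    for (String word : text.toLowerCase().split("[^a-z]+")) {
      if (word.length() < 2 || STOP_WORDS.contains(word)) {
        continue;
      }
      counts.merge(word, 1, Integer::sum);
    }
    return counts;
  }

  // Returns the n most frequent words; ties are ordered alphabetically.
  static List<String> top(Map<String, Integer> counts, int n) {
    List<String> words = new ArrayList<>(counts.keySet());
    words.sort((a, b) -> {
      int byCount = Integer.compare(counts.get(b), counts.get(a));
      return byCount != 0 ? byCount : a.compareTo(b);
    });
    return new ArrayList<>(words.subList(0, Math.min(n, words.size())));
  }
}
```

WordCounter.top(WordCounter.count(text), 5) would then give the 5 words for the circle. Note that this version lowercases everything and drops digits, so dates and ages are not counted.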
(You could take the next 10 most frequent words and display them on a circle around the first 5.) Or, for each of the 5 words (A), you could place the words that follow it in the original text (B) in a circle around it and draw an arrow from each A to B. You could also keep only words that start with a capital letter, on the assumption that they are a name, a city or a phrase like Federal Court. In that case, filter out the words that are only capitalised because they stand at the beginning of a sentence.
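The capital-letter idea could be sketched like this, again in plain Java with names I made up. It assumes a sentence ends with ".", "!" or "?", which is a rough guess: abbreviations like "V." or "Ab." will be mistaken for sentence ends.

```java
import java.util.ArrayList;
import java.util.List;

class NameCandidates {
  // Collects capitalised words that are not the first word of a sentence.
  static List<String> find(String text) {
    List<String> found = new ArrayList<>();
    boolean sentenceStart = true; // the very first word starts a sentence
    for (String token : text.trim().split("\\s+")) {
      // Strip punctuation and digits around the word, then a possessive 's.
      String word = token.replaceAll("^[^A-Za-z]+|[^A-Za-z]+$", "").replaceAll("'s$", "");
      if (word.isEmpty()) {
        continue; // tokens like "--" or "22," carry no word
      }
      if (!sentenceStart && Character.isUpperCase(word.charAt(0))) {
        found.add(word);
      }
      // The next word starts a sentence if this token ends with . ! or ?
      // (optionally followed by a closing quote or bracket).
      sentenceStart = token.matches(".*[.!?][\"')\\]]*");
    }
    return found;
  }
}
```

The trade-off is visible in the sample article: a name that happens to open a sentence ("Hanif, 39, was accused ...") is skipped at that position, so it is only picked up where it appears mid-sentence.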
You could use only the text between the first -- and the second --, if that pattern holds in all your .txt files, because what comes before or after might be of less interest. Or at least drop what comes after the second -- occurrence.
I think this is fairly easy. The books above are definitely worth reading.
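Cutting out the part between the two -- markers might look like the sketch below (plain Java, hypothetical names). It assumes the first -- follows the dateline and the second one precedes the agency name, as in the sample article; if a file has no marker at all, the whole text is kept.

```java
class BodyExtractor {
  // Returns the text between the first and second "--", trimmed.
  static String body(String article) {
    String marker = "--";
    int first = article.indexOf(marker);
    if (first < 0) {
      return article.trim(); // no marker: keep everything
    }
    int start = first + marker.length();
    int second = article.indexOf(marker, start);
    if (second < 0) {
      return article.substring(start).trim(); // only one marker: keep the rest
    }
    return article.substring(start, second).trim();
  }
}
```

If an article uses -- inside its body text as well, this would cut too early, so it is worth checking a few files by hand first.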
A further analysis, like what is the verb, subject or object, is harder. Maybe you can find a list of cities or a list of terms with two or more words (like Federal Court) and make sure your sketch recognizes them as one entity, which is hard. Or the same event is called "murder" in one place and "the act" in another. How can a sketch recognize that they mean the same thing?
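For the multi-word names, one simple workaround (not real language analysis) is to glue known phrases together with an underscore before counting. The sketch below is plain Java; the phrase list is a hand-made example that you would have to extend yourself, and it assumes the phrases contain only letters and single spaces.

```java
import java.util.Arrays;
import java.util.List;

class PhraseMerger {
  // Hand-made sample list; put longer phrases before shorter ones they contain.
  static final List<String> PHRASES = Arrays.asList(
      "Court of Appeal", "Kuala Lumpur", "Federal Court", "High Court");

  // Replaces each known phrase with a single underscore-joined token.
  static String merge(String text) {
    String result = text;
    for (String phrase : PHRASES) {
      // Allow any whitespace (including a line break) between the words.
      String regex = "\\b" + phrase.replace(" ", "\\s+") + "\\b";
      result = result.replaceAll(regex, phrase.replace(' ', '_'));
    }
    return result;
  }
}
```

After this step "Kuala Lumpur" reads "Kuala_Lumpur", so any word splitter that treats the underscore as part of a word will see it as one entity. It does nothing for the harder problem of "murder" versus "the act"; that would need a synonym list or similar.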
Greetings, Chrisir
Thanks a lot, both of you.
Actually, I saved the news in Dropbox. Maybe I should save it on my laptop to make it easier to load. I will try my best to code it and show you later. I'll do what you recommend.