As a first pass, we will do the following labelling.
Bookkeeping spreadsheet: here. Add a row to it every time you create a new article: assign the article a new id and record its link.
For each article, create a text file named article_id.txt with the following format:
Line 1: article_id (a001, a002 ...)
Line 2: URL
Line 3: Headline
Line 4: Byline (If multiple authors, separate by semicolon ("Poorna Kumar; Viswajith Venugopal"))
Line 5 onwards: Body
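The article file layout above can be read back with a few lines of Python. This is only a minimal sketch, assuming the file is plain text with exactly the line order listed; the `Article` class and `parse_article` function are hypothetical names introduced here for illustration, not part of any existing tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Article:
    article_id: str
    url: str
    headline: str
    authors: List[str]
    body: str


def parse_article(text: str) -> Article:
    """Parse the contents of an article_id.txt file (hypothetical helper)."""
    lines = text.splitlines()
    if len(lines) < 4:
        raise ValueError("expected at least 4 header lines (id, URL, headline, byline)")
    # Line 4 is the byline; multiple authors are separated by semicolons.
    authors = [name.strip() for name in lines[3].split(";") if name.strip()]
    # Line 5 onwards is the body; keep its line breaks as they are.
    body = "\n".join(lines[4:])
    return Article(
        article_id=lines[0].strip(),
        url=lines[1].strip(),
        headline=lines[2].strip(),
        authors=authors,
        body=body,
    )
```

Splitting the byline on semicolons gives one entry per author, so a byline such as "Poorna Kumar; Viswajith Venugopal" yields a two-element list.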
(For TSVs, article_id can be a001p, a002p, ... for Poorna's and a001v, a002v, ... for Viswa's.)
The annotation goes in a text file named article_id.tsv, with the following format (one line per person mentioned):
Full Name, Gender, Number of times mentioned (only by part or full name, NOT by pronoun), Says something (yes/no), Number of words quoted, Source/subject (src/sub), Adjectives (a comma-separated list), Expert/non-expert, Profession/Role(s)
UPDATE (Week of March 13th): As per Maneesh's instructions, I'm also adding a 'Quotes' column at the end of new articles I annotate. It contains the raw quotes attributed to the person. Different quotes from the same person are delimited by the special token ''.
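A rough sketch of how one annotation file could be loaded for later analysis. It assumes the columns are tab-separated (as the .tsv extension suggests) and appear in the order listed above, with the optional 'Quotes' column last. The `PersonAnnotation` class, its field names, and `parse_annotations` are hypothetical. The quotes field is kept as a raw string here, so the delimiter token is not interpreted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PersonAnnotation:
    full_name: str
    gender: str
    mentions: int          # mentions by part or full name, not by pronoun
    says_something: bool   # "yes"/"no" in the file
    words_quoted: int
    role: str              # "src" or "sub"
    adjectives: List[str]
    expertise: str         # expert / non-expert
    professions: str
    quotes: Optional[str] = None  # raw 'Quotes' column, if present


def parse_annotations(text: str) -> List[PersonAnnotation]:
    """Parse the contents of an article_id.tsv file (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 9:
            raise ValueError(f"expected at least 9 columns, got {len(fields)}")
        rows.append(PersonAnnotation(
            full_name=fields[0].strip(),
            gender=fields[1].strip(),
            mentions=int(fields[2]),
            says_something=fields[3].strip().lower() == "yes",
            words_quoted=int(fields[4]),
            role=fields[5].strip(),
            # Adjectives are a comma-separated list inside a single column.
            adjectives=[a.strip() for a in fields[6].split(",") if a.strip()],
            expertise=fields[7].strip(),
            professions=fields[8].strip(),
            quotes=fields[9] if len(fields) > 9 else None,
        ))
    return rows
```

Rows from older articles without the 'Quotes' column simply get `quotes=None`, so both generations of files load through the same function.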
List of small problems, to be looked at by Viswa if possible:
a068: Article on Afghan cricket scene. Really not clear on who is a source versus a subject.
a069: Article on American citizens detained in North Korea. Not clear about whether the detainees are subjects or neither source nor subject.
a070: Article on Met's Opera House. Should we mark Verdi and Strauss as mentions?
a096: Is Hughes a source? (I think yes). In that case, this is a good example of an article where the source is in the first paragraph. Also, in general, is Sean Spicer an expert source? In this article I have called him a source (debatable) and a non-expert (debatable).