GICS Documents — Information Retrieval Fields

Some documents collected by Bilby are relevant to one or more nodes in the GICS tree (a taxonomy used by analysts to classify equities). Bilby labels each such document with the relevant GICS node, quantifies the degree of relevance with a floating-point score, and ranks each day's documents by this score.

| Field Name | Type | Example/Possible Values | Description |
|---|---|---|---|
| relevant_gics | string | 2030-Transportation | The GICS node that is relevant to the underlying document. The format is {code}-{name}, where code is the GICS code and name is the GICS name. |
| theme_relevance_score | number | 0 ≤ value ≤ 1 | The relevance score of the underlying document to the GICS tree node given by relevant_gics. |
| theme_relevance_rank | number | 1, 2, 3, 4, and so on | The rank of the underlying document by theme_relevance_score, compared to other GICS-related documents published on the same day. |

relevant_gics

Definition: The GICS node that is relevant to the underlying document.

Possible values: Any of the nodes on the GICS taxonomy tree. For example:

  1. (Sector level): 10-Energy
  2. (Industry group level): 2030-Transportation
  3. (Industry level): 251020-Automobiles
  4. (Sub-industry level): 30202010-Agricultural products and services
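
As a minimal sketch of how a consumer might work with this field, the snippet below splits a relevant_gics value into its code and name and infers the tree level from the length of the code (2, 4, 6, or 8 digits, matching the four examples above). The helper name parse_relevant_gics and the GicsNode type are hypothetical and are not part of Bilby's API.

```python
from typing import NamedTuple

# Level names keyed by GICS code length (2, 4, 6, or 8 digits).
GICS_LEVELS = {2: "sector", 4: "industry group", 6: "industry", 8: "sub-industry"}


class GicsNode(NamedTuple):
    code: str
    name: str
    level: str


def parse_relevant_gics(value: str) -> GicsNode:
    """Split a '{code}-{name}' string into its parts (hypothetical helper)."""
    # Split on the first hyphen only, so hyphens inside the name are preserved.
    code, sep, name = value.partition("-")
    if not sep or not code.isdigit() or len(code) not in GICS_LEVELS:
        raise ValueError(f"not a valid relevant_gics value: {value!r}")
    return GicsNode(code, name, GICS_LEVELS[len(code)])
```

Splitting on the first hyphen only is a deliberate choice here: it keeps the parse unambiguous even if a node name itself contains a hyphen.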

theme_relevance_score

Definition: Bilby has created a classifier model for each node on the GICS tree. Each model computes the relevance of the underlying document to its GICS node, and the result is stored in the theme_relevance_score field.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).
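
One common use of a bounded score is to keep only documents above a cutoff. The sketch below assumes documents arrive as Python dicts keyed by the field names in this section; the function name filter_by_score and the default threshold of 0.5 are illustrative choices, not values recommended by Bilby.

```python
def filter_by_score(docs, threshold=0.5):
    """Keep documents whose theme_relevance_score is at least `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    kept = []
    for doc in docs:
        score = doc["theme_relevance_score"]
        # Scores are documented as lying in [0, 1]; reject anything else.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        if score >= threshold:
            kept.append(doc)
    return kept
```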


theme_relevance_rank

Definition: On any given day, the document with the Nth-highest theme_relevance_score is assigned a theme_relevance_rank of N. For example, the highest-scored document is given a rank of 1; the second-highest, a rank of 2, and so on.

Possible values: Positive integers 1, 2, 3, ...
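
The ranking rule above can be sketched as follows. This is an illustration only, not Bilby's implementation: it assumes documents are dicts carrying a hypothetical published_date field, groups by day alone as the definition states, and breaks ties by input order, since the handling of equal scores is not specified here.

```python
from collections import defaultdict


def assign_ranks(docs):
    """Return copies of docs with theme_relevance_rank assigned per day."""
    # Group documents by publication day (published_date is a hypothetical field).
    by_day = defaultdict(list)
    for doc in docs:
        by_day[doc["published_date"]].append(doc)

    ranked = []
    for day in sorted(by_day):
        # Highest score first; Python's sort is stable, so ties keep input order.
        ordered = sorted(
            by_day[day],
            key=lambda d: d["theme_relevance_score"],
            reverse=True,
        )
        for rank, doc in enumerate(ordered, start=1):
            ranked.append({**doc, "theme_relevance_rank": rank})
    return ranked
```

If ranks should instead be computed separately for each GICS node, the grouping key could be extended to the pair of day and relevant_gics.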