Basic Documents — Sector Fields

The sector fields indicate the economic sector of the document behind each data row, as determined by Bilby's classifier models. The class labels comprise the eleven sectors defined by the Global Industry Classification Standard (GICS), along with two additions: "Macro" and "Not Relevant".

| Field Name | Type | Example/Possible Values | Description |
| --- | --- | --- | --- |
| sector_prediction | string[] (array of two strings) | Example: ["Energy", "Information Technology"]. Possible values: Energy, Materials, Industrials, Consumer Discretionary, Consumer Staples, Health Care, Financials, Information Technology, Communication Services, Utilities, Real Estate, Macro, Not Relevant | A list of length 2 containing the top two sector predictions for the document. |
| sector_probability_energy | number | 0 ≤ value ≤ 1 | The probability of the document being in the energy sector. |
| sector_probability_materials | number | 0 ≤ value ≤ 1 | The probability of the document being in the materials sector. |
| sector_probability_industrials | number | 0 ≤ value ≤ 1 | The probability of the document being in the industrials sector. |
| sector_probability_consumer_discretionary | number | 0 ≤ value ≤ 1 | The probability of the document being in the consumer discretionary sector. |
| sector_probability_consumer_staples | number | 0 ≤ value ≤ 1 | The probability of the document being in the consumer staples sector. |
| sector_probability_health_care | number | 0 ≤ value ≤ 1 | The probability of the document being in the health care sector. |
| sector_probability_financials | number | 0 ≤ value ≤ 1 | The probability of the document being in the financials sector. |
| sector_probability_information_technology | number | 0 ≤ value ≤ 1 | The probability of the document being in the information technology sector. |
| sector_probability_communication_services | number | 0 ≤ value ≤ 1 | The probability of the document being in the communication services sector. |
| sector_probability_utilities | number | 0 ≤ value ≤ 1 | The probability of the document being in the utilities sector. |
| sector_probability_real_estate | number | 0 ≤ value ≤ 1 | The probability of the document being in the real estate sector. |
| sector_probability_macro | number | 0 ≤ value ≤ 1 | The probability of the document relating to the macroeconomy (the Macro label). |
| sector_probability_not_relevant | number | 0 ≤ value ≤ 1 | The probability of the document being relevant neither to any GICS sector nor to the macroeconomy (the Not Relevant label). |

Note: As detailed in the above table, each of the thirteen fields of the form sector_probability_<sector> (where <sector> ranges over the eleven GICS sectors, plus the Macro and Not Relevant labels) represents a probability. However, these probabilities are computed by thirteen independent binary classification models. As a result, these thirteen fields will NOT sum to 1.0, in general, for any given data row.
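
Because each score comes from its own binary model, it is best read as a standalone value rather than as a share of a distribution. The following is a minimal sketch of that point, assuming a data row is available as a Python dict keyed by the field names above. The helper `sector_scores` and the sample values are hypothetical and for illustration only.

```python
# Field-name suffixes taken from the table above.
SECTOR_SUFFIXES = [
    "energy", "materials", "industrials", "consumer_discretionary",
    "consumer_staples", "health_care", "financials",
    "information_technology", "communication_services", "utilities",
    "real_estate", "macro", "not_relevant",
]


def sector_scores(row):
    """Collect the thirteen sector_probability_<sector> values from a row."""
    return {s: row[f"sector_probability_{s}"] for s in SECTOR_SUFFIXES}


# Hypothetical row: the values are invented for illustration only.
row = {f"sector_probability_{s}": 0.05 for s in SECTOR_SUFFIXES}
row["sector_probability_energy"] = 0.9
row["sector_probability_utilities"] = 0.75

scores = sector_scores(row)
total = sum(scores.values())

# Every score lies in [0, 1], yet the total is not constrained to be 1.0.
assert all(0.0 <= p <= 1.0 for p in scores.values())
print(f"sum of scores: {total:.2f}")

# Rank labels by their individual scores.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[:2])
```

The ranking at the end is only one way to compare scores. This page does not state how sector_prediction is derived from the probabilities, so the two should not be assumed to agree in every row.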


sector_prediction

Definition: The two class labels (MSCI GICS sectors, Macro, or Not Relevant) that are most relevant to the underlying document.

Possible values: An array of two labels, each of which is one of the following:

  1. One of the eleven standard GICS sectors, namely Energy, Materials, Industrials, Consumer Discretionary, Consumer Staples, Health Care, Financials, Information Technology, Communication Services, Utilities, Real Estate;
  2. Macro: relevant to the macroeconomy;
  3. Not Relevant: relevant neither to any GICS sector nor to the macroeconomy.
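
The constraints above (a list of exactly two strings, each drawn from the thirteen labels) can be checked mechanically. This is a minimal sketch assuming the field has already been parsed into a Python list. `VALID_LABELS` and `is_valid_sector_prediction` are hypothetical names, not part of any Bilby API.

```python
# The thirteen class labels listed above.
VALID_LABELS = {
    "Energy", "Materials", "Industrials", "Consumer Discretionary",
    "Consumer Staples", "Health Care", "Financials",
    "Information Technology", "Communication Services", "Utilities",
    "Real Estate", "Macro", "Not Relevant",
}


def is_valid_sector_prediction(value):
    """Return True if value is a list of exactly two known class labels."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and all(isinstance(v, str) and v in VALID_LABELS for v in value)
    )


print(is_valid_sector_prediction(["Energy", "Information Technology"]))
print(is_valid_sector_prediction(["Energy", "Tech"]))
```

The sketch does not require the two labels to be distinct, since this page does not say whether duplicates can occur.
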

sector_probability_energy

Definition: The probability that the underlying document relates to the energy sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_materials

Definition: The probability that the underlying document relates to the materials sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_industrials

Definition: The probability that the underlying document relates to the industrials sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_consumer_discretionary

Definition: The probability that the underlying document relates to the consumer discretionary sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_consumer_staples

Definition: The probability that the underlying document relates to the consumer staples sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_health_care

Definition: The probability that the underlying document relates to the health care sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_financials

Definition: The probability that the underlying document relates to the financials sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_information_technology

Definition: The probability that the underlying document relates to the information technology sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_communication_services

Definition: The probability that the underlying document relates to the communication services sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_utilities

Definition: The probability that the underlying document relates to the utilities sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_real_estate

Definition: The probability that the underlying document relates to the real estate sector.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_macro

Definition: The probability that the underlying document relates to the macroeconomy.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).


sector_probability_not_relevant

Definition: The probability that the underlying document is neither relevant to any of the eleven GICS sectors, nor relevant to macro.

Possible values: A float in the interval [0, 1] (i.e., between 0 and 1, inclusive).
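
The labels that appear in sector_prediction and the suffixes of the sector_probability_<sector> field names differ only in case and spacing, so one can be mapped to the other. This is a minimal sketch assuming a row is available as a Python dict. `probability_field` and `predicted_scores` are hypothetical helpers, and the sample row is invented for illustration.

```python
def probability_field(label):
    """Map a class label such as "Health Care" to its probability field name."""
    return "sector_probability_" + label.lower().replace(" ", "_")


def predicted_scores(row):
    """Look up the probability field for each label in sector_prediction."""
    return {label: row[probability_field(label)]
            for label in row["sector_prediction"]}


# Hypothetical row containing only the fields needed for this example.
row = {
    "sector_prediction": ["Health Care", "Macro"],
    "sector_probability_health_care": 0.8,
    "sector_probability_macro": 0.4,
}

print(probability_field("Not Relevant"))
print(predicted_scores(row))
```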