Basic Documents — Fundamental Fields
The basic metadata values common to all documents collected by Bilby.
Field Name | Type | Example/Possible Values | Description |
---|---|---|---|
id (basic) | string | 614836f6732855d8b931dca9141665d7b405f8e402a1c6e68f008fe35e22414a | The unique identifier of the document. |
id (commodities) | string | f627348a244f879e43d41bede462a8ecdca944836d1663160c19e4bfb3641a01-aluminum | The unique identifier of the document. |
id (gics) | string | 65b5d92b7ec8dcf015ffdc0cf88a8fc3d47084422c3feab75fa7207c915d58b0-60-real-estate | The unique identifier of the document. |
source_line | string | official_line | The source line of the document. |
source_country | string | China | The source country of the document. |
source_language | string | Chinese | The source language of the document. |
utc_date | date | 2023-03-04T00:00:00.000Z | The UTC date of the document. |
published_at | timestamp | 2023-03-04T00:00:00.000Z | The published date of the document. |
id
Definition: A unique identifier for the document.
Example Values:
- For the basic dataset: 614836f6732855d8b931dca9141665d7b405f8e402a1c6e68f008fe35e22414a
- For the commodities dataset: f627348a244f879e43d41bede462a8ecdca944836d1663160c19e4bfb3641a01-aluminum
- For the gics dataset: 65b5d92b7ec8dcf015ffdc0cf88a8fc3d47084422c3feab75fa7207c915d58b0-60-real-estate
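The example values suggest that commodities and gics identifiers are a hexadecimal string followed by a hyphen and a dataset-specific suffix. The sketch below splits an id on that basis. The function name split_document_id is hypothetical, and the "hex part, then optional suffix" layout is an assumption inferred from the examples, not a documented guarantee.

```python
import string

def split_document_id(doc_id):
    """Split an id into (hex_part, suffix); suffix is None when absent.

    Assumption: everything before the first hyphen is the hexadecimal
    part, and everything after it is a dataset-specific suffix.
    """
    hex_part, separator, suffix = doc_id.partition("-")
    if not hex_part or any(c not in string.hexdigits for c in hex_part):
        raise ValueError(f"unexpected id format: {doc_id!r}")
    return hex_part, (suffix if separator else None)


basic_id = "614836f6732855d8b931dca9141665d7b405f8e402a1c6e68f008fe35e22414a"
gics_id = "65b5d92b7ec8dcf015ffdc0cf88a8fc3d47084422c3feab75fa7207c915d58b0-60-real-estate"

basic_parts = split_document_id(basic_id)
gics_parts = split_document_id(gics_id)
```

For the basic example the suffix is None; for the gics example the suffix is the text after the first hyphen.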
source_line
Each row in the API corresponds to a document. The source of a document is the
official name of its publisher, e.g. People’s Daily. Bilby groups these
sources into source lines, listed in the table below. Sources within the
same line play a similar role in the governance of China. For example, all
sources within the ministry line are published by ministries, and all sources
within the regulatory_line are published by regulatory agencies.
Line | Description |
---|---|
official_line | Official media sources of the country. |
regulatory_line | Regulatory agencies of the country that produce legal documents. |
private_line | Private media sources of the country. |
ministry | Ministries of the country. |
SOE | State-Owned Enterprises. |
party | Political parties. |
bank | Banks. |
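As a minimal sketch of how the source_line values in the table above might be used on the client side, the following groups already-fetched documents by line. It assumes each document is a plain dict with a source_line key; the function name group_by_source_line and the sample records are hypothetical and not part of the Bilby API.

```python
from collections import defaultdict

# Source line values copied from the table above.
KNOWN_SOURCE_LINES = {
    "official_line", "regulatory_line", "private_line",
    "ministry", "SOE", "party", "bank",
}

def group_by_source_line(documents):
    """Group document dicts by their source_line value."""
    groups = defaultdict(list)
    for doc in documents:
        line = doc.get("source_line")
        if line not in KNOWN_SOURCE_LINES:
            raise ValueError(f"unexpected source_line: {line!r}")
        groups[line].append(doc)
    return dict(groups)


# Hypothetical sample records, for illustration only.
sample_docs = [
    {"id": "a", "source_line": "official_line"},
    {"id": "b", "source_line": "ministry"},
    {"id": "c", "source_line": "official_line"},
]
grouped = group_by_source_line(sample_docs)
```

Rejecting unknown values is a design choice for this sketch; a client that should tolerate newly added lines could collect them under a separate key instead.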
source_country
Definition: The country that the source belongs to.
Possible values: Currently, the only value for this field is China.
source_language
Definition: The language of the original document.
Possible values: Currently, the only values for this field are Chinese and English.
Note: Currently, fewer than one percent of the documents are in English.
utc_date
Definition: The date of the document, expressed in the UTC time zone.
Example value: 2023-03-04T00:00:00.000Z.
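The example value can be parsed with Python's standard library, as sketched below. This assumes utc_date is always serialized in the same layout as the example (date, time with fractional seconds, trailing Z); the function name parse_utc_date is hypothetical.

```python
from datetime import datetime, timezone

# Layout inferred from the example value; assumed, not documented.
UTC_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

def parse_utc_date(value):
    """Parse a utc_date string into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, UTC_DATE_FORMAT)
    # strptime treats the trailing "Z" as a literal here, so attach UTC explicitly.
    return parsed.replace(tzinfo=timezone.utc)


utc_date = parse_utc_date("2023-03-04T00:00:00.000Z")
```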
published_at
Definition: The publication date of the document, expressed in the original time zone of the document.
Example value: 2023-03-04T00:00:00.000Z.