
COVID-19 related virus data, environmental data and policy data


stccenter/COVID-19-Data


STC COVID-19 Dataset

License: CC BY 4.0

This repository stores COVID-19 case data and related natural and social factors (e.g., environmental observations, policy indices) at multiple scales, organized according to ISO standards.

Data Organization

Datasets are organized by region, ranging from global to country level. Under each folder, multi-scale daily reports and summary reports are provided separately.

Field Description

Daily Data

Daily data provides automatically updated daily information on COVID-19 cases and related attributes.

| Attribute Name | Description | Format | Example |
| --- | --- | --- | --- |
| date | The date that the data represents. UTC time is used for this dataset; all values are calculated before the end of that date in UTC. | Date (YYYY/MM/DD) in UTC | 2020/04/09 |
| country_name | Name of the country. | string | United States |
| iso3 | 3-letter ISO country code. | varchar(3) | USA |
| admin1_name | Name of the admin 1 level unit. | string | Virginia |
| hasc1 | Hierarchical administrative subdivision code (HASC) for the admin 1 level. | string | US.VA (for Virginia, United States) |
| local_id1 | ID that represents the country's admin 1 level unit. | string | VA (for Virginia, United States) |
| admin2_name | Name of the admin 2 level unit. | string | Fairfax County |
| hasc2 | Hierarchical administrative subdivision code (HASC) for the admin 2 level. | string | US.VA.FX (for Fairfax, Virginia, United States) |
| local_id2 | ID that represents the country's admin 2 level unit. | string | 51059 (for Fairfax, Virginia, United States) |
| confirmed | Number of confirmed cases. | integer | 777 |
| death | Number of deaths. | integer | 19 |
| recovered | Number of recovered cases (might be null for admin 2 level). | integer | null |
| Miscellaneous | Other data attributed to our dataset. | TBD | TBD |
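The field descriptions above can be turned into a small loader. The following is a minimal sketch, assuming the daily reports are CSV files whose header uses the attribute names listed in the table; the column order, the `DailyRecord` class, the helper names, and the inline sample row (assembled from the Example column) are illustrative assumptions, not part of the repository.

```python
import csv
import datetime as dt
import io
from dataclasses import dataclass
from typing import List, Optional, TextIO

# Hypothetical sample built from the Example column of the table above.
SAMPLE_CSV = (
    "date,country_name,iso3,admin1_name,hasc1,local_id1,"
    "admin2_name,hasc2,local_id2,confirmed,death,recovered\n"
    "2020/04/09,United States,USA,Virginia,US.VA,VA,"
    "Fairfax County,US.VA.FX,51059,777,19,null\n"
)


@dataclass
class DailyRecord:
    date: dt.date
    country_name: str
    iso3: str
    admin1_name: str
    hasc1: str
    local_id1: str
    admin2_name: str
    hasc2: str
    local_id2: str
    confirmed: Optional[int]
    death: Optional[int]
    recovered: Optional[int]


def parse_count(value: str) -> Optional[int]:
    """Return an int, or None for empty or 'null' cells (e.g. recovered at admin 2)."""
    text = value.strip()
    if text == "" or text.lower() == "null":
        return None
    return int(text)


def load_daily(handle: TextIO) -> List[DailyRecord]:
    """Read daily rows; dates are interpreted as YYYY/MM/DD calendar dates in UTC."""
    records = []
    for row in csv.DictReader(handle):
        records.append(
            DailyRecord(
                date=dt.datetime.strptime(row["date"], "%Y/%m/%d").date(),
                country_name=row["country_name"],
                iso3=row["iso3"],
                admin1_name=row["admin1_name"],
                hasc1=row["hasc1"],
                local_id1=row["local_id1"],
                admin2_name=row["admin2_name"],
                hasc2=row["hasc2"],
                local_id2=row["local_id2"],  # kept as a string, per the Format column
                confirmed=parse_count(row["confirmed"]),
                death=parse_count(row["death"]),
                recovered=parse_count(row["recovered"]),
            )
        )
    return records


def hasc_is_nested(record: DailyRecord) -> bool:
    """Check that the admin 2 HASC code extends the admin 1 code (US.VA -> US.VA.FX)."""
    return record.hasc2.startswith(record.hasc1 + ".")


records = load_daily(io.StringIO(SAMPLE_CSV))
print(records[0].admin2_name, records[0].confirmed, records[0].recovered)
```

To read a real report, pass an open file handle to `load_daily` instead of the in-memory sample. The actual files may use a different column order or extra columns, so check the header first.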

Summary Data

Summary data records COVID-19 cases and related attributes over time to show the timeline of cases.

| Attribute Name | Description | Format | Example |
| --- | --- | --- | --- |
| country_name | Name of the country. | string | "US" |
| iso3 | 3-letter ISO country code. | varchar(3) | USA |
| admin1_name | Name of the admin 1 level unit. | string | State for USA |
| date | The date that the data represents. UTC time is used for this dataset; all values are calculated before the end of that date in UTC. | UTC | YYYY/MM/DD |
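Because the `date` field is expressed in UTC, a locally observed timestamp can fall on a different reporting date than its local calendar day. The sketch below illustrates this with the standard library; the function name `utc_report_date` and the fixed UTC-4 offset are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def utc_report_date(timestamp: datetime) -> str:
    """Return the UTC calendar date (YYYY/MM/DD) that a timestamp falls on."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return timestamp.astimezone(timezone.utc).strftime("%Y/%m/%d")


# Fixed UTC-4 offset used purely for illustration.
eastern = timezone(timedelta(hours=-4))
evening = datetime(2020, 4, 9, 21, 30, tzinfo=eastern)

# 21:30 at UTC-4 is 01:30 UTC on the following day.
print(utc_report_date(evening))  # 2020/04/10
```

When joining this dataset with locally timestamped sources, converting to the UTC date first avoids off-by-one-day mismatches.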

Tutorial - Visualize Virus Cases on Map using QGIS

Overall Data Sources by Country

Legend for data source and operation status

| Country / Region | Continent | Admin level | Data Source | Temporal Coverage | Operation Status |
| --- | --- | --- | --- | --- | --- |
| Global | Global | 0 | | 2020/1/22 to current | |
| United States | North America | 1, 2 | | admin0: 2020/1/22 to current, admin1: 2020/1/27 to current | |
| China | Asia | 1, 2 | | admin0: 2020/1/22 to current, admin1: 2020/1/24 to current | |
| Canada | North America | 1 | | 2020/1/26 to current | |
| Australia | Oceania | 1 | | 2020/1/27 to current | |
| Italy | Europe | 1, 2 | | 2020/2/24 to current | |
| Germany | Europe | 1 | | 2020/2/29 to current | |
| Austria | Europe | 1 | | 2020/3/4 to current | |
| Brazil | South America | 1 | | 2020/2/26 to current | |
| Chile | South America | 1 | | 2020/3/2 to current | |
| Japan | Asia | 1 | | 2020/1/15 to current | |
| Russia | Europe | 1 | | 2020/3/22 to current | |
| South Africa | Africa | 1 | | 2020/3/5 to current | |
| Croatia | Europe | 1 | | 2020/3/21 to current | |
| Sweden | Europe | 1 | | 2020/3/16 to current | |
| India | Asia | 1 | | 2020/3/10 to current | |
| Hungary | Europe | 1 | | 2020/3/31 to current | |
| Denmark | Europe | 1 | | 2020/5/20 to current | |
| Ukraine | Europe | 1 | | 2020/4/5 to current | |
| Latvia | Europe | 1 | | 2020/3/19 to current | |
| Albania | Europe | 1 | | 2020/4/22 to current | |
| Haiti | North America | 1 | | 2020/3/19 to current | |
| Romania | Europe | 1 | | 2020/4/2 to current | |
| Mexico | North America | 1 | | 2020/4/25 to current | |
| Nigeria | Africa | 1 | | 2020/2/27 to current | |
| Pakistan | Asia | 1 | | 2020/3/10 to current | |
| Bolivia | South America | 1 | | 2020/6/4 to 2020/7/29 | |
| Guatemala | North America | 1 | | 2020/3/15 to 2020/8/14 | |
| El Salvador | North America | 1 | | 2020/6/6 to 2020/7/4 | |
| Switzerland | Europe | 1 | | 2020/6/1 to 2020/8/10 | |
| Bulgaria | Europe | 1 | | 2020/6/6 to 2020/8/10 | |
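The Temporal Coverage values follow a simple "start to end" pattern, where the end is either a date or the word "current". A minimal sketch of parsing one such range and checking whether a given day is covered might look like this; `parse_coverage` and `covers` are hypothetical helper names, and entries that list several levels (e.g. "admin0: ..., admin1: ...") would need to be split into individual ranges before being passed in.

```python
from datetime import date, datetime
from typing import Optional, Tuple


def parse_coverage(text: str) -> Tuple[date, Optional[date]]:
    """Parse 'YYYY/M/D to YYYY/M/D' or 'YYYY/M/D to current'.

    An end of None means the source is still being updated ("current").
    """
    start_text, end_text = (part.strip() for part in text.split(" to "))
    start = datetime.strptime(start_text, "%Y/%m/%d").date()
    if end_text == "current":
        return start, None
    return start, datetime.strptime(end_text, "%Y/%m/%d").date()


def covers(text: str, day: date) -> bool:
    """Return True if the given day falls inside the coverage range (inclusive)."""
    start, end = parse_coverage(text)
    return start <= day and (end is None or day <= end)


print(parse_coverage("2020/6/4 to 2020/7/29"))
print(covers("2020/1/22 to current", date(2020, 5, 1)))
```

This is useful for filtering out countries whose collection stopped before the period of interest, such as the rows with a fixed end date in the table above.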

Recommended Citation

  • Sha, D., Liu, Y., Liu, Q., Li, Y., Tian, Y., Beaini, F., Zhong, C., Hu, T., Wang, Z., Lan, H., Zhou, Y., Zhang, Z. and Yang, C., 2020. A spatiotemporal data collection of viral cases for COVID-19 rapid response. Big Earth Data, pp.1-21. DOI: 10.1080/20964471.2020.1844934
@article{doi:10.1080/20964471.2020.1844934,
  author = {Dexuan Sha and Yi Liu and Qian Liu and Yun Li and Yifei Tian and Fayez Beaini and Cheng Zhong and Tao Hu and Zifu Wang and Hai Lan and You Zhou and Zhiran Zhang and Chaowei Yang},
  title = {A spatiotemporal data collection of viral cases for COVID-19 rapid response},
  journal = {Big Earth Data},
  volume = {0},
  number = {0},
  pages = {1-21},
  year = {2020},
  publisher = {Taylor \& Francis},
  doi = {10.1080/20964471.2020.1844934}
}
  • Liu, Q., Liu, W., Sha, D., Kumar, S., Chang, E., Arora, V., Lan, H., Li, Y., Wang, Z., Zhang, Y. and Zhang, Z., 2020. An Environmental Data Collection for COVID-19 Pandemic Research. Data, 5(3), p.68.
@article{liu2020environmental,
  title={An Environmental Data Collection for COVID-19 Pandemic Research},
  author={Liu, Qian and Liu, Wei and Sha, Dexuan and Kumar, Shubham and Chang, Emily and Arora, Vishakh and Lan, Hai and Li, Yun and Wang, Zifu and Zhang, Yadong and others},
  journal={Data},
  volume={5},
  number={3},
  pages={68},
  year={2020},
  publisher={Multidisciplinary Digital Publishing Institute}
}
  • Yang, C., Sha, D., Liu, Q., Li, Y., Lan, H., Guan, W.W., Hu, T., Li, Z., Zhang, Z., Thompson, J.H. and Wang, Z., 2020. Taking the pulse of COVID-19: A spatiotemporal perspective. International Journal of Digital Earth, pp.1-26.
@article{yang2020taking,
  title={Taking the pulse of COVID-19: A spatiotemporal perspective},
  author={Yang, Chaowei and Sha, Dexuan and Liu, Qian and Li, Yun and Lan, Hai and Guan, Weihe Wendy and Hu, Tao and Li, Zhenlong and Zhang, Zhiran and Thompson, John Hoot and others},
  journal={International Journal of Digital Earth},
  pages={1--26},
  year={2020},
  publisher={Taylor \& Francis}
}

Source Change Log

People Contribution & Credit

  • Phil Yang, PI and supervisor.
  • Wendy Guan, Co-PI.
  • Shuming Bao, collaborator.
  • Dexuan Sha, project leader, metadata and standard design, crawler and ETL development, operation management.
  • Yun Li, GitHub management, data report generation and quality control.
  • Qian Liu, Environmental factor design, acquisition and preprocessing.
  • Chen Zhong, data crawler and ETL development.
  • You Zhou, policy, news and publication collection, coding, and labelling. Daily operation and data quality control.
  • Yifei Tian, data crawler and ETL development.
  • Fayez Beaini, data source collection and evaluation, quality control.
  • Tao Hu, cooperation leader from Harvard University and China Data Lab.
  • Zifu Wang and Hai Lan, IT infrastructure and network security support.
  • Zhiran Zhang, visualization.
  • Wei Liu, data processing.
  • Akhil Kumar, data validation.
  • Swetha Bhattaram, data validation.
  • Yogya Kalra, data validation.

Disclaimer

All data in this repository was collected/calculated/calibrated from multiple publicly available data sources that do not always agree. While we try our best to keep the information up to date and correct, we make no representations or warranties of any kind, express or implied, about the completeness, accuracy, or reliability of the data. We do not bear any legal responsibility for any consequence caused by the use of the data provided. Reliance on the data for medical guidance or use of the data in commerce is strictly prohibited. NSF STcenter hereby disclaims any and all representations and warranties with respect to the data repository, including accuracy, fitness for use, and merchantability. For countries with internal disputes or sensitive regions, we do not include the affected data in our datasets. If you are interested in that data, you can contact us directly.

License

The dataset is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
