Selected Datasets

The OpenKG community curates high-quality datasets spanning common sense, encyclopedic knowledge, finance, and medical care.

A schema reference standard managed and maintained by OpenKG that accounts for the characteristics of the Chinese language and the specific application requirements of Chinese-language domains.

OneGraph

A knowledge graph project initiated and maintained by OpenKG that is built with large language models and intended to serve large language model applications.

Centered on viruses and bacteria, it extends to related treatments and diseases and integrates encyclopedic knowledge to form a novel coronavirus encyclopedia knowledge graph.

A large multimodal knowledge graph, which contains multimodal knowledge that can be applied to many fields such as natural language processing and computer vision.

Alibaba Group released the first large-scale open business knowledge graph to promote deep understanding of retail data.

A large-scale Chinese concept knowledge graph based on knowledge extraction, containing a large number of fine-grained concepts.

A Chinese knowledge graph for high school geography that can provide students with better computer-assisted education.

A novel large-scale multimodal academic knowledge graph for the earth sciences.

A first attempt to construct a Chinese general knowledge graph by extracting structured data from open encyclopedia data.

A large-scale general domain structured encyclopedia developed and maintained by the Knowledge Factory Laboratory of Fudan University.

The Peking University Chinese Encyclopedia Knowledge Graph is a knowledge base built by automatically collecting knowledge from multiple sources, including Wikipedia, DBpedia, and Baidu Encyclopedia.

The encyclopedia knowledge graph carefully created by Goosegrass Technology includes things, facts, concepts, rules, etc.

A knowledge graph of common diseases that includes common diseases, symptoms, treatments, commonly used medicines, recommended recipes, etc.

A large-scale Chinese concept map developed and maintained by the Knowledge Factory Laboratory of Fudan University, with an accuracy above 95% for isA relationships.

This dataset is derived from 41 published diabetes guidelines and consensus articles, covering the most widely studied research topics and hot areas of recent years.

A set of high-quality Chinese vocabulary compiled and launched by the Natural Language Processing and Social Humanities Computing Laboratory of Tsinghua University.

It extracts structured information from heterogeneous cross-language online encyclopedias and is the first large-scale knowledge graph that balances Chinese and English knowledge.

Harbin Institute of Technology released a system that automatically crawls entities and entity concepts from the Internet to form a general knowledge graph based on hierarchical relationships.

The Chinese Open Word Network collects important open knowledge bases and knowledge graph projects from China and abroad, and compiles the related Chinese-language materials.

A clinical terminology published by Yidu Cloud, manually edited by Yidu Cloud doctors based on the distribution observed in real medical records, which provides a basis for standardizing clinical terms.
