Discovering the appropriate type of an entity in the Web of Data is still considered an open challenge, given the complexity of the many tasks it entails. The most notable of these is the definition of a generic, cross-domain ontology. While the ontologies proposed in the past function mostly as schemata for knowledge bases of different sizes, an ontology for entity typing requires a rich, accurate and easily traversable type hierarchy. It is also desirable that the hierarchy contain thousands of nodes and multiple levels, which a manually curated ontology cannot offer. Such a level of detail is required to describe all the possible environments in which an entity exists. Furthermore, the ontology must be generated automatically, combining the most widely used data sources and keeping pace with the Web.
deepschema.org is the first ontology that combines two well-known ontological resources, Wikidata and schema.org, to obtain a highly accurate, generic type ontology that is at the same time a first-class citizen in the Web of Data.
deepschema.org has a public GitHub repository, where you can find a toolkit covering all the phases of producing the final ontology: extraction, filtering, integration and crowdsourcing.
These files were produced with the deepschema.org toolkit, which was given as input i) the Wikidata 20160208 JSON dump and ii) the schema.org 2.2 release.
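
To give a rough idea of what an extraction step over a Wikidata JSON dump can look like, the sketch below collects "subclass of" (P279) links into a child-to-parents map. It is not the deepschema.org toolkit's code: the function names, the choice of P279 as the only hierarchy property, and the sample records (with placeholder ids Q100 and Q200) are assumptions made for illustration. The sketch also assumes the usual dump layout of one entity per line inside a JSON array, and it accepts entity references given either as an "id" string or as a "numeric-id".

```python
import json
from typing import Dict, Iterable, Iterator, List, Set

SUBCLASS_OF = "P279"  # Wikidata property "subclass of"


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per dump line, skipping the array brackets."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def parent_ids(entity: dict) -> List[str]:
    """Return the ids of the classes this entity is declared a subclass of."""
    parents = []
    for claim in entity.get("claims", {}).get(SUBCLASS_OF, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "novalue" / "somevalue" snaks carry no target
        value = snak["datavalue"]["value"]
        # Assumption: the target is given as "id" or only as "numeric-id".
        parents.append(value.get("id") or "Q%d" % value["numeric-id"])
    return parents


def build_hierarchy(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each class id to the set of its direct superclass ids."""
    hierarchy: Dict[str, Set[str]] = {}
    for entity in iter_entities(lines):
        parents = parent_ids(entity)
        if parents:
            hierarchy.setdefault(entity["id"], set()).update(parents)
    return hierarchy


# Made-up sample in the dump layout; the ids are placeholders.
SAMPLE_LINES = [
    "[",
    '{"id":"Q100","claims":{"P279":[{"mainsnak":{"snaktype":"value",'
    '"property":"P279","datavalue":{"value":{"entity-type":"item",'
    '"numeric-id":200},"type":"wikibase-entityid"}}}]}},',
    '{"id":"Q200","claims":{}}',
    "]",
]

hierarchy = build_hierarchy(SAMPLE_LINES)
```

For a real dump, the same function can be fed an open file handle (for example one returned by `gzip.open(path, "rt", encoding="utf-8")`), so that the dump is processed line by line rather than loaded into memory at once.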