Scientific Software Citation Data Dictionary

IS562 -- Metadata in Theory & Practice

Fall 2019

Version 1.0

December 9th, 2019

By Sam Walkow (swalkow2) and Nikolaus Parulian (nnp2)

Contents

Scientific Software Citation Data Dictionary

Contents

Introduction

Purpose/Intended Use

Data Model

Controlled Vocabularies

Publication Type

Publication Keywords (About)

Funding

Software Keywords (About)

Software License

Software Usage Type

Glossary

Extensibility/Future Directions

Limitations

Entity Semantics Units List

Entity Semantics Units Tables

Test Corpus / Project Proposal

Summary


Introduction

This data dictionary provides instructions for creating metadata records for scientific software citation. The schema is specifically designed to capture metadata about the software applications that contribute to academic publications and how they contribute, and to give credit to the maintainers and contributors of those applications. The types of metadata captured by this schema are described below.

The Scientific Software Citation schema is an RDF-structured schema that defines entities (people, publications, software, etc.) and properties of those entities (names, identifiers, keywords, etc.), and that links the relevant properties together. This schema uses several entities defined by schema.org and introduces new entities that are unique to this schema.

Purpose/Intended Use

This schema is intended to illustrate how software supports academic publications by linking software usage and intended purpose directly to the outcomes shown in a publication, such as figures, units of analysis, and discussion or conclusions, so that quantitative results are tied to the software that supported them. Additionally, the schema describes the people and organizations involved, including the publication authors, software contributors and maintainers, funders, and publishers. The schema aims to structure this metadata so that academic publication queries can become more sophisticated: publications could be searched by the software they used, and software by the publications that used it. The purpose is twofold:

  1. To give credit to all persons involved in a publication
  2. To allow software to be an easily searchable and citable component of academic literature

This data dictionary was written with the intention that authors would create metadata records for their own publications. However, the instructions should be comprehensive enough that anyone could create a scientific software citation metadata record.

This schema is intended to capture the metadata that was relevant at the time the software was used. While some values are subject to change, the schema is meant to preserve items such as the version of the software used at that time and the contributors who were involved at that time. In this way, outcomes from software applications can be reproduced, because the metadata accurately describes the necessary technical and preservation needs.

Data Model

The Scientific Software Citation model is based on the RDF conceptual model and is designed to link publications with one or more authors, software applications, and contributors. Beyond that, it is also designed to link one or more software applications with one or more outcomes (a scientific product that came from the software, such as a chart or a type of analysis) and with one or more types of usage or intended usage (such as scientific visualization, numerical analysis, statistics, etc.).

The schema is designed with flexibility for the authors in mind, so that the authors of a publication can record the usage and the products or outcomes of the software as they used it. We realize this could mean that the same software application is recorded with different uses and different outcomes depending on the publication, and this is intended. What should remain constant is the publication as the center of the model, together with the unique identifiers for persons (authors or contributors) and their work (publication or software), so that items can be searched by those entities and credit can be given.

The schema is a combination of items from schema.org and newly defined entities that describe the software linked to publications. Each entity type has several entity properties attached to it. Entity properties are meant to hold values that describe the relationship between the publication and the software. Value types include text, numbers, date types, URLs, links to previously defined entities, and blank nodes. There are more details in the definitions section.

Our custom schema entities are denoted with the prefix scs: (Software Citation Schema). This is a customized schema that contains the classes and properties we created as a complement to the schema.org schema.

For schema.org, we provide an Application Profile that describes how we use the schema.org classes and properties and fit them to the proposed Software Citation model.

As the figure shows, there are five major entities we want to capture. We do so by providing a custom Application Profile for schema.org and by introducing a new schema, the Software Citation Schema (scs), which is meant to complement schema.org and capture any parameters or properties that schema.org does not originally support. Each entity is explained in the Entity Semantics Units List and Entity Semantics Units Tables sections.

For each entity, we also prepare a corpus target folder, so that records for different entities are kept separate, as follows:

http://softwarecitation.web.illinois.edu/corpus/author/

http://softwarecitation.web.illinois.edu/corpus/software/

http://softwarecitation.web.illinois.edu/corpus/repository/

http://softwarecitation.web.illinois.edu/corpus/developer/

Controlled Vocabularies

A number of controlled vocabularies are used in the following entities, with links to the entity table with further details:

  1. Publication Type

2.8 schema:publicationType

These terms describe the different platforms and formats in which an academic work can be published.

  2. Publication Keywords (About)

2.3 schema:about

These terms include the keywords used to tag a scholarly work, as seen in a publication’s keywords section.

  3. Funding

3.6 schema:funder

These terms include the names of the different funding bodies behind academic publications and software.

  4. Software Keywords (About)

3.4 schema:about

These terms include the keywords used to tag software repositories, as seen in a GitHub tags section.

  5. Software License

3.8 schema:license

These terms include the names of the different legal licenses behind software applications.

  6. Software Usage Type

2.9.1.2.1.3 scs:usageType

These terms describe the various purposes researchers might have for software in their research or for the publication.

Glossary

In this section we give a high-level view of our data structure. We developed two semantic structures following RDF Schema:

  1. Entity type: defines a class

Ex. scs:Author

  2. Entity property: defines a property belonging to a class

Ex. schema:givenName

Below is a definition of each row in the entity tables:

  1. Entity type: Name of the entity class
       1. Entity properties: The properties linked to the defined class
       2. Subclass of: Name of the parent class the entity type descends from
       3. Definition: What values and metadata this class is meant to capture
       4. Rationale: Why this entity type is included in the schema

  2. Entity property: Name of the property
       1. Definition: What values and metadata this property is meant to capture
       2. Rationale: Why this property is included in the schema and the class
       3. Data constraint: The possible types of values this property accepts
       4. Obligation: Indicates whether this property is mandatory or optional
       5. Repeatable: Indicates whether this property is repeatable
       6. Usage notes: Explanation of how to enter values for the property, including use cases
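The obligation and repeatability rows lend themselves to a mechanical check. The following is a minimal Python sketch, not part of the schema itself: the names PropertySpec, AUTHOR_SPECS, and check_record are hypothetical, and the record is assumed to be a plain dictionary mapping property names to lists of values. The obligation and repeatable values are copied from the scs:Author property tables (1.1 to 1.6).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertySpec:
    """Selected rows of one entity-property table (hypothetical helper)."""
    name: str             # e.g. "schema:familyName"
    data_constraint: str  # e.g. "Text"
    mandatory: bool       # Obligation row: True = Mandatory, False = Optional
    repeatable: bool      # Repeatable row: True = Yes, False = No


# Values copied from the scs:Author property tables 1.1 to 1.6.
AUTHOR_SPECS = [
    PropertySpec("schema:givenName", "Text", mandatory=False, repeatable=True),
    PropertySpec("schema:additionalName", "Text", mandatory=False, repeatable=True),
    PropertySpec("schema:familyName", "Text", mandatory=True, repeatable=False),
    PropertySpec("schema:affiliation", "Text", mandatory=False, repeatable=True),
    PropertySpec("schema:identifier", "Text", mandatory=True, repeatable=True),
    PropertySpec("schema:email", "Text", mandatory=False, repeatable=True),
]


def check_record(record, specs):
    """Return a list of problems; `record` maps property name -> list of values."""
    problems = []
    for spec in specs:
        values = record.get(spec.name, [])
        if spec.mandatory and not values:
            problems.append(f"{spec.name} is mandatory but missing")
        if not spec.repeatable and len(values) > 1:
            problems.append(
                f"{spec.name} is not repeatable but has {len(values)} values"
            )
    return problems


author = {
    "schema:givenName": ["Matthew"],
    "schema:additionalName": ["J."],
    "schema:familyName": ["Turk"],
}
print(check_record(author, AUTHOR_SPECS))
# prints ['schema:identifier is mandatory but missing']
```

The same approach could be extended to the other entity types by writing one list of specs per table.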

Extensibility/Future Directions

  1. A promising future direction would be automating parts of record creation for this schema. On the software side, certain properties could be taken directly from a code repository such as GitHub or Bitbucket, which could ease some of the manual work and effort involved in recording this metadata. Other areas open to automation are the keywords and the DOI. However, parts of this schema will always have to be completed manually, because several fields are subjective and up to the authors to decide, or the information does not exist anywhere else.
  2. Future use cases could include leveraging the software outcome and software usage values as search parameters. It would be interesting to group publications and software by their real-world use cases when searching for information. It would also be interesting to see whether this information could be included in software like Zotero, where the software used could live beside the publications in one local database for authors. Authors could see what software they have used and who wrote it, add tags, make notes, etc.
  3. This could also influence the page rank of open source software. When users search for software, this metadata could help them find software that reflects their purpose and research needs.
  4. A future outcome could also be measuring the impact of open source software on academic work. Capturing accurate metrics for open source applications has been much discussed, and this schema could help build the platform needed to make those metrics more robust and useful.
  5. This schema could also be used to demonstrate the need for funding support or to justify the use of funding for open source development.

Limitations

While this schema is intended to give as much credit as possible to everyone involved in a publication, we had to compromise on the software side, and the instructions ask only for the lead contributor or maintainer of a software application. Creators of metadata records can repeat that entity and include as many code contributors as they want. However, some open source software applications have hundreds of contributors at different levels of the project and in different roles. It may be too arduous to include every contributor, and this does limit the credit given to contributors, although automation may help with this in the future.

Along those lines, there is no easy way to give credit to all the dependencies that a software application may be built on. For example, we use an open source visualization and analysis library called ‘yt’ as an example throughout this document, but we do not specify that yt depends on several other software packages such as NumPy and Matplotlib. There is space in this schema structure to specify as many interdependent software applications as needed; however, the number of dependencies is often too large to enter manually. We hope automation in this area could make that addition to the schema possible.

Additional limitations arise from one-way links between some entities in our RDF structure. For example, schema:SoftwareSourceCode has space for a link to schema:SoftwareApplication, but not the other way around. This could limit the impact this schema has on searchability.

Entity Semantics Units List

1. scs:Author
    1.1 schema:givenName
    1.2 schema:additionalName
    1.3 schema:familyName
    1.4 schema:affiliation
    1.5 schema:identifier
    1.6 schema:email

2. scs:ScholarlyArticle
    2.1 schema:identifier
    2.2 schema:name
    2.3 schema:about
    2.4 scs:author
        2.4.1 scs:AuthorOrder
            2.4.1.1 scs:author
            2.4.1.2 scs:order
    2.5 schema:abstract
    2.6 schema:publisher
    2.7 schema:datePublished
    2.8 schema:publicationType
    2.9 scs:useSoftware
        2.9.1 scs:SoftwareOutcome
            2.9.1.1 schema:identifier
            2.9.1.2 scs:outcome
                2.9.1.2.1 scs:Outcome
                    2.9.1.2.1.1 schema:articleBody
                    2.9.1.2.1.2 schema:pageStart
                    2.9.1.2.1.3 scs:usageType
    2.10 schema:url

3. schema:SoftwareApplication
    3.1 schema:name
    3.2 schema:alternate
    3.3 schema:description
    3.4 schema:about
    3.5 schema:relatedLink
    3.6 schema:funder
    3.7 schema:version
    3.8 schema:license
    3.9 schema:isBasedOn

4. schema:SoftwareSourceCode
    4.1 schema:codeRepository
    4.2 schema:isBasedOn
    4.3 schema:targetProduct
    4.4 schema:description
    4.5 schema:contributor

5. scs:CodeMaintainer
    5.1 schema:identifier
    5.2 schema:name
    5.3 schema:sameAs


Entity Semantics Units Tables

Entity types

1. scs:Author

Entity properties

1.1 schema:givenName

1.2 schema:additionalName

1.3 schema:familyName

1.4 schema:affiliation

1.5 schema:identifier

1.6 schema:email

Subclass of

schema:Person

Definition

This class, scs:Author, is an application profile of schema:Person. This entity defines the person who will be credited with working on or contributing to the work that makes up a publication. An Author class should be used to define an author, or a person involved in the publication or the software that supported it. This entity should be a person who is mentioned in the publication.

Rationale

Authors are listed on scholarly publications, usually on the first page, and are the entities intended to be credited with the work of writing a publication.

Complete example

<author/#colinAllen>

        a scs:Author ;

        schema:identifier "orcid:/0000-0003-4497-1725"^^xsd:string ;

        schema:givenName "Colin"^^xsd:string ;

        schema:familyName "Allen"^^xsd:string ;

        schema:affiliation "Department of History and Philosophy of Science and Program in Cognitive Science, Indiana University"^^xsd:string ;

        schema:affiliation "Indiana University"^^xsd:string ;

        schema:email "colallen@indiana.edu" ;

        .

Entity properties

1.1 schema:givenName

Definition

First or given name of the author of a publication

Rationale

An author’s first name, which in the United States or western world is the word that appears first in the name and is part of how a person is identified.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

This property can be repeated for multiple given names. Depending on the publication’s name format, often only the first initial or several initials are included; these can be recorded in givenName. Include hyphenated names as one name.

Example:

From the first page of a publication:

  • schema:givenName "Matthew" ;
  • schema:givenName "Paul" ;
  • schema:givenName "S.C.O" ;
  • schema:givenName "T." ;

Entity properties

1.2 schema:additionalName

Definition

Middle name or initial of the author of a publication

Rationale

An author’s middle name, which in the United States or western world is the word that appears second in a name.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

Include hyphenated names as one name. This property can be repeated for multiple additional names. Some publications include only the first initial or several initials, which can also be recorded in givenName.

Example:

Using the same source as 1.1:

  • schema:givenName "Matthew" ;
    schema:additionalName "J." ;
  • schema:givenName "Paul" ;
  • schema:givenName "S.C.O" ;
  • schema:givenName "T." ;
    schema:additionalName "H." ;

Entity properties

1.3 schema:familyName

Definition

Last name of the author of a publication

Rationale

An author’s last name, which in the United States or western world is the word that appears last in a name, is often how authors are identified and how they receive credit for their work.

Data constraint

Text

Obligation

Mandatory

Repeatable

No

Usage notes

Include hyphenated names as one name. This property cannot be repeated to indicate multiple family names (since this is rare). Because many authors are known by this name, the field is mandatory so that credit can be given to the author. If an author name is unclear or not formatted in the western style, include the entire name in this property.

Example:

Using the same source as 1.1:

  • <#author1>
    a scs:Author ;
    schema:givenName "Matthew" ;
    schema:additionalName "J." ;
    schema:familyName "Turk" ;
  • schema:givenName "Paul" ;
    schema:familyName "Clark" ;
  • schema:givenName "S.C.O" ;
    schema:familyName "Glover" ;
  • schema:givenName "T." ;
    schema:additionalName "H." ;
    schema:familyName "Grief" ;

Entity properties

1.4 schema:affiliation

Definition

An author’s affiliation to an institute, company, entity or group

Rationale

An author’s affiliation is often included in publications and is an important note about the author’s professional identity.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

Multiple affiliations can be accommodated with repeat entries, which can be text or a hyperlink. Record the affiliation as it appears on the publication, in the same format. We strongly recommend recording only the name of the affiliation rather than the name and address, although the address can be included.

Example:

Using the same source as 1.1:

  • schema:givenName "Matthew" ;
    schema:additionalName "J." ;
    schema:familyName "Turk" ;
    schema:affiliation "Center for Astrophysics and Space Science" ;
    schema:affiliation "University of California-San Diego" ;
  • schema:givenName "Paul" ;
    schema:familyName "Clark" ;
    schema:affiliation "Zentrum fur Astronomie der Universitat Heidelberg" ;
    schema:affiliation "Institut fur Theoretische Astrophysik" ;

Entity properties

1.5 schema:identifier

Definition

An author’s unique identification method, often an ORCID or some other unique value assigned to an author.

Rationale

An author’s unique identification method so their work can be credited to the correct person. We strongly recommend the use of an ORCID.

Data constraint

Text

Obligation

Mandatory

Repeatable

Yes

Usage notes

We strongly recommend using an ORCID; however, any identifier that uniquely describes the author can be used. Be sure to specify the type of identifier before the identifier itself. See the example below.

The identifier value must follow this formatting specification:

<name_of_identifier>:/<value_of_identifier>

Controlled vocabulary for name_of_identifier:

  • orcid:/
  • local:/ custom identifier for local usage

An ORCID can be looked up at https://orcid.org/orcid-search/search

Multiple identifiers can be accommodated with repeat entries, which can be text or a hyperlink.

Example:

  • schema:givenName "Samantha" ;
    schema:familyName "Walkow" ;
    schema:identifier "orcid:/0000-0001-7329-1863" ;

If we are using a local identifier:

  • schema:givenName "Samantha" ;
    schema:familyName "Walkow" ;
    schema:identifier "local:/swalkow" ;
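The formatting specification for author identifiers can be checked before a record is saved. The sketch below is only an illustration in Python: the function names are hypothetical, and the ORCID pattern (four hyphen-separated groups of four characters, where the last character may be X) is our assumption about how strictly a tool might validate the value, not a rule stated by this schema.

```python
import re

# Prefixes from the controlled vocabulary for author identifiers.
AUTHOR_ID_PREFIXES = {"orcid", "local"}

# Assumed shape of an ORCID iD: 0000-0000-0000-000X (last character digit or X).
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def split_identifier(value):
    """Split '<name_of_identifier>:/<value_of_identifier>' at the first ':/'."""
    name, separator, rest = value.partition(":/")
    if not separator or not name or not rest:
        raise ValueError(f"expected <name>:/<value>, got {value!r}")
    return name, rest


def is_valid_author_identifier(value):
    """True if the value uses a known prefix and, for ORCID, the assumed shape."""
    try:
        name, rest = split_identifier(value)
    except ValueError:
        return False
    if name not in AUTHOR_ID_PREFIXES:
        return False
    if name == "orcid":
        return ORCID_PATTERN.fullmatch(rest) is not None
    return True


print(is_valid_author_identifier("orcid:/0000-0001-7329-1863"))  # prints True
print(is_valid_author_identifier("0000-0001-7329-1863"))         # prints False
```

Local identifiers are accepted with any non-empty value, since the schema leaves their form to the record creator.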

Entity properties

1.6 schema:email

Definition

An email address that serves as a point of contact for an author, so they can be reached and further credited with their work.

Rationale

Email addresses are often, but not always, included on a publication, although they are sometimes difficult to retrieve. An author can be contacted or identified this way.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

Include the email that is on the publication, which can usually be found on the front page of the publication or webpage. Other ways to contact an author can be included here, but please specify what the contact is for.

Example:

  • schema:email "colallen@indiana.edu" ;

Entity types

2. scs:ScholarlyArticle

Entity properties

2.1 schema:identifier
2.2 schema:name
2.3 schema:about
2.4 scs:author
    2.4.1 scs:AuthorOrder
        2.4.1.1 scs:author
        2.4.1.2 scs:order
2.5 schema:abstract
2.6 schema:publisher
2.7 schema:datePublished
2.8 schema:publicationType
2.9 scs:useSoftware
    2.9.1 scs:SoftwareOutcome
        2.9.1.1 schema:identifier
        2.9.1.2 scs:outcome
            2.9.1.2.1 scs:Outcome
                2.9.1.2.1.1 schema:articleBody
                2.9.1.2.1.2 schema:pageStart
                2.9.1.2.1.3 scs:usageType

Subclass of

schema:ScholarlyArticle

Definition

A scholarly article is a written academic work, such as a paper, article, journal article, chapter, or conference paper, in either physical or digital form. The publication class is mainly derived from the class schema:ScholarlyArticle. This entity defines the paper, or the item that relates to the actual work, which is later linked to authors and software applications. This is the center of the data model. One scholarly article entity should represent only one actual publication in the real world.

Rationale

ScholarlyArticle entities and the SoftwareApplication entities that support them are the two kinds of entity between which we are trying to establish a relationship, and whose relationship we are trying to characterize.

Complete Example

<article/#article1>

        a scs:ScholarlyArticle ;

        schema:identifier "doi:/10.1086/673276" ;

        schema:name "Cross-Cutting Categorization Schemes in the Digital Humanities"^^xsd:string ;

        schema:about "local:/digital humanities"^^xsd:string ;

        scs:author [

                scs:Author <author/#colinAllen> ;

                scs:order 1 ;

        ] ;

        schema:abstract """Digital access to large amounts of scholarly text presents both challenges ….."""^^xsd:string ;

        schema:publisher "University of Chicago Press"^^xsd:string ;

        schema:datePublished "2013" ;

        schema:publicationType "journal" ;

        scs:useSoftware [

                a scs:SoftwareOutcome ;

                schema:identifier <software/inpho> ;

                scs:outcome [

                        a scs:Outcome ;

                        schema:articleBody """AT THE INDIANA PHILOSOPHY ONTOLOGY PROJECT (InPhO) we have developed and are continuing to develop methods for categorizing and linking philosophical

ideas and thinkers."""^^xsd:string ;

                        schema:pageStart "1" ;

                        scs:usageType "development" ;

                ]

        ]

        .

Entity properties

2.1 schema:identifier

Definition

A unique identifier for the individual publication. We strongly recommend using the DOI.

Rationale

A unique identifier will ensure that individual publications can be found when searched and can be linked to other entities.

Data constraint

Text

Obligation

Mandatory

Repeatable

Yes

Usage notes

The identifier value must follow this formatting specification:

<name_of_identifier>:/<value_of_identifier>

Controlled vocabulary for name_of_identifier:

  • doi
  • pubmedid
  • local: custom identifier for local usage

We strongly recommend using the DOI, but other identifiers can be used as long as they are unique. If that is the case, please indicate what type of identifier you are using.

Example:

  • schema:name "EFFECTS OF VARYING THE THREE-BODY MOLECULAR HYDROGEN FORMATION RATE IN PRIMORDIAL STAR FORMATION" ;
    schema:identifier "doi:/10.1088/0004-637X/726/1/55" ;

Entity properties

2.2 schema:name

Definition

The name of the publication, as it appears on the publication

Rationale

Names or titles are how publications are identified, they also describe the subject matter, and are often how people search for publications.

Data constraint

Text

Obligation

Mandatory

Repeatable

No

Usage notes

Include the name that is on the publication, which can usually be found on the front page of the publication or webpage.

Example:

  • schema:name "EFFECTS OF VARYING THE THREE-BODY MOLECULAR HYDROGEN FORMATION RATE IN PRIMORDIAL STAR FORMATION" ;

Entity properties

2.3 schema:about

Definition

Keywords describing the content of the publication

Rationale

Keywords are often used as search terms and can identify a paper as part of a particular domain or area

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

We recommend using a controlled vocabulary such as MeSH (https://meshb.nlm.nih.gov/search) or ACM (https://dl.acm.org/ccs/ccs_flat.cfm), or the keywords as shown on the front page of the publication. If the keywords are not from a controlled vocabulary, you can specify that they are ‘local’ values.

Provide your about values (keywords) at as fine a granularity as possible.

The ‘about’ value must follow this formatting specification:

<name_of_keyword>:/<value_of_the_keyword>

Controlled vocabulary for name_of_keyword:

  • acm
  • mesh
  • local: custom identifier for local usage

Example:

Following the 2.1 example:

  • schema:name "EFFECTS OF VARYING THE THREE-BODY MOLECULAR HYDROGEN FORMATION RATE IN PRIMORDIAL STAR FORMATION" ;
    schema:about "local:/cosmology: theory" ;
    schema:about "local:/galaxies: formation" ;
    schema:about "local:/stars: formation" ;
    schema:about "local:/regions" ;

Another example, using ACM terms:

  • schema:name "Compiler Transformations for High-Performance Computing" ;
    schema:about "acm:/Concurrent Programming" ;
    schema:about "acm:/Processors—compilers" ;
    schema:about "acm:/Automatic Programming-program transformation" ;
    schema:about "local:/compilation" ;
    schema:about "local:/dependence analysis" ;
    schema:about "local:/vectorization" ;
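Because keyword values may themselves contain a colon (for example "cosmology: theory"), a tool reading these records should split only at the first ":/". The following Python sketch groups about values by vocabulary; the function name group_keywords is hypothetical, and rejecting unknown vocabularies is our assumption about how a tool might behave.

```python
from collections import defaultdict

# Vocabulary names listed for name_of_keyword.
KEYWORD_VOCABULARIES = {"acm", "mesh", "local"}


def group_keywords(about_values):
    """Group '<name_of_keyword>:/<value>' strings by vocabulary name."""
    grouped = defaultdict(list)
    for value in about_values:
        # partition() splits at the first ':/' only, so later colons survive.
        vocabulary, separator, keyword = value.partition(":/")
        if not separator or vocabulary not in KEYWORD_VOCABULARIES:
            raise ValueError(f"unrecognized about value: {value!r}")
        grouped[vocabulary].append(keyword)
    return dict(grouped)


values = [
    "acm:/Concurrent Programming",
    "local:/compilation",
    "local:/cosmology: theory",
]
print(group_keywords(values))
# prints {'acm': ['Concurrent Programming'], 'local': ['compilation', 'cosmology: theory']}
```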

Entity properties

2.4 scs:author

Definition

The order of the authors, if applicable, as they appear on the publication

Rationale

Author order can be an indication of the amount of work contributed, seniority, or other socially important factors and should be included in the record to help describe record accuracy. This may also be a factor in search terms.

Data constraint

scs:Author;

scs:AuthorOrder;

Obligation

Optional

Repeatable

Yes

Usage notes

Details:

scs:Author: refers to a predefined entity of class Author

scs:AuthorOrder: a blank node entity that provides flexibility in preserving the ordinal position of an author within a given ScholarlyArticle entity

Record the authors by name in a list, indicating the order in which they appear on the publication.

Example:

  • Example if we don’t want to preserve order

schema:name "EFFECTS OF VARYING THE THREE-BODY MOLECULAR HYDROGEN FORMATION RATE IN PRIMORDIAL STAR FORMATION" ;
scs:author <#author1> ;

  • Example if we want to maintain author order

schema:name "EFFECTS OF VARYING THE THREE-BODY MOLECULAR HYDROGEN FORMATION RATE IN PRIMORDIAL STAR FORMATION" ;
scs:author [
    a scs:AuthorOrder ;
    scs:Author <#author1> ;
    scs:order 1 ;
] ;
scs:author [
    a scs:AuthorOrder ;
    scs:Author <#author2> ;
    scs:order 2 ;
] ;
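When a publication has many authors, the ordered blank nodes can be generated rather than typed by hand. The Python sketch below is one possible approach, not part of the schema: the function name author_order_turtle is hypothetical, and the property names follow the ordered-author example in this section. It numbers authors from 1 in the order given.

```python
def author_order_turtle(author_iris):
    """Emit one scs:AuthorOrder blank node per author, numbered from 1."""
    lines = []
    for position, iri in enumerate(author_iris, start=1):
        lines.append("scs:author [")
        lines.append("    a scs:AuthorOrder ;")
        lines.append(f"    scs:Author <{iri}> ;")
        lines.append(f"    scs:order {position} ;")
        lines.append("] ;")
    return "\n".join(lines)


# Authors are passed in the order they appear on the publication.
print(author_order_turtle(["#author1", "#author2"]))
```

The output is a fragment meant to be pasted inside a ScholarlyArticle record; it is not a complete Turtle document on its own.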

Entity properties

2.5 schema:abstract

Definition

The abstract of the publication, which is a brief summary of the publication contents at the beginning of the publication.

Rationale

Abstracts provide a brief overview of the publication and often appear in online search results, which helps users decide whether to read further.

Data constraint

Text

Obligation

Optional

Repeatable

No

Usage notes

Use the abstract as it appears on the publication.

Example:

  • schema:name "EFFECTS OF VARYING THE THREE-BODY MOLECULAR HYDROGEN FORMATION RATE IN PRIMORDIAL STAR FORMATION" ;

schema:abstract "The transformation of atomic hydrogen to molecular hydrogen through three-body reactions is a crucial stage in the collapse of primordial, metal-free halos, where the first generation of stars (Population III stars) in the universe is formed. However, in the published literature, the rate coefficient for this reaction is uncertain by nearly an order of magnitude. We report on the results of both adaptive mesh refinement and smoothed particle hydrodynamics simulations of the collapse of metal-free halos as a function of the value of this rate coefficient. For each simulation method, we have simulated a single halo three times, using three different values of the rate coefficient. We find that while variation between halo realizations may be greater than that caused by the three-body rate coefficient being used, both the accretion physics onto Population III protostars as well as the long-term stability of the disk and any potential fragmentation may depend strongly on this rate coefficient." ;

Entity properties

2.6 schema:publisher

Definition

The name of the publisher such as the journal or book publisher

Rationale

Often users search by publisher when looking for scholarly articles, which can also indicate the subject matter.

Data constraint

schema:Organization ;

schema:Person ;

Obligation

Optional

Repeatable

Yes

Usage notes

The publisher information should be included at the top of the front page of the publication, or the website it is hosted on. We are only expecting the journal name, but you could include the page and volume details as they appeared on the publication.

Example:

  • schema:identifier "doi:/10.1088/0004-637X/726/1/55" ;
    schema:publisher "The Astrophysical Journal" ;

Entity properties

2.7 schema:datePublished

Definition

The date the publication was published

Rationale

Publication dates can indicate relevance, and are also used as search parameters by users.

Data constraint

Date

Obligation

Optional

Repeatable

No

Usage notes

The publication date should be included at the top of the front page of the publication, or on the website where it is hosted. We recommend that you follow the format “year-month-day” (YYYY-MM-DD) if the date is entered as text.

Example:

  • schema:identifier "doi:/10.1088/0004-637X/726/1/55" ;
    schema:datePublished "2011-01-05" ;
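The recommended YYYY-MM-DD format can be verified with a short check. This Python sketch is illustrative only, and the function name is_iso_date is hypothetical: it first checks the shape of the text and then confirms that the parts form a real calendar date.

```python
import re
from datetime import date

# Four-digit year, two-digit month, two-digit day, separated by hyphens.
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_iso_date(text):
    """True if `text` is a real calendar date written as YYYY-MM-DD."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for dates like 2011-02-30
    except ValueError:
        return False
    return True


print(is_iso_date("2011-01-05"))  # prints True
print(is_iso_date("2011-02-30"))  # prints False
```

A year-only value such as "2013" does not pass this check, so a tool that accepts coarser dates would need a looser rule.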

Entity properties

2.8 schema:publicationType

Definition

The type of publication, that is, the format in which the publication appeared when published.

Rationale

Different types of publications can indicate the purpose, review process, and domain.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

We strongly recommend using a term from the MeSH controlled vocabulary: https://www.nlm.nih.gov/mesh/pubtypes.html

For example:

  • pre-print
  • journal article
  • conference paper
  • book chapter

Example:

  • schema:identifier "doi:/10.1088/0004-637X/726/1/55" ;
    schema:publicationType "journal article" ;

Entity properties

2.9 scs:citeSoftware

Definition

This property represents a citation from a publication to a SoftwareApplication entity. One publication can have many citations, or links, to SoftwareApplication entities. Besides linking the software application to the citation, we can also record outcomes that capture how the software was used and how that use is represented in the paper. This is important for determining how the software was used in the paper.

Rationale

Because we want to understand the relationship between publications and software, a publication entity must be created first. Each value of this property must point to either a SoftwareApplication or an scs:SoftwareOutcome. For the purposes of this schema, at least one ScholarlyArticle entity must have at least one scs:citeSoftware property. The scs:SoftwareOutcome entity can carry information about how the software is used in the article. The person creating the record can obtain this value by reading the article and noting the figures or text that mention the use of software.

Data constraint

schema:SoftwareApplication;

scs:SoftwareOutcome;

Obligation

Mandatory

Repeatable

Yes

Usage notes

Example:

  • Example for target schema:SoftwareApplication entity

schema:identifier "doi:/10.1088/0004-637X/726/1/55" ;
scs:citeSoftware <http://software-citation.org/sw/yt> ;

  • Example for target scs:SoftwareOutcome entity blank node

schema:identifier "doi:/10.1088/0004-637X/726/1/55" ;
scs:citeSoftware [
    a scs:SoftwareOutcome ;
    schema:identifier <http://software-citation.org/sw/yt> ;
    scs:outcome [
        a scs:Outcome ;
        schema:articleBody "Figure 1. Script that load" ;
        scs:usageType "visualization" ;
    ]
]

Details about the SoftwareOutcome entity are explained in 2.9.1 scs:SoftwareOutcome.
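Once publications are linked to software in this way, the links can be inverted so that publications are found by the software they used. The Python sketch below illustrates the idea with an in-memory index; the tuple layout and the function name index_by_software are hypothetical, and the three-part rows are taken from the examples in this dictionary rather than from a real corpus query.

```python
from collections import defaultdict

# (publication identifier, software IRI, usage type), from the examples in this dictionary.
CITATIONS = [
    ("doi:/10.1088/0004-637X/726/1/55", "http://software-citation.org/sw/yt", "visualization"),
    ("doi:/10.1086/673276", "software/inpho", "development"),
]


def index_by_software(citations):
    """Map each software IRI to the publications that used it and how."""
    index = defaultdict(list)
    for publication, software, usage_type in citations:
        index[software].append((publication, usage_type))
    return dict(index)


index = index_by_software(CITATIONS)
print(index["software/inpho"])
# prints [('doi:/10.1086/673276', 'development')]
```

In a full implementation the same lookup would more likely be a query over the RDF graph; the dictionary here only shows the shape of the result.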

Entity types

2.9.1 scs:SoftwareOutcome

Entity properties

2.9.1.1 schema:identifier

2.9.1.2 scs:outcome

   2.9.1.2.1 scs:Outcome      

Subclass of

schema:Thing;

Definition

A class definition for the sub property scs:citeSoftware of Software Citation type. This class provides a schema that give a more detailed explanation  about the cited software and it outcome that stated in the paper/publication

Rationale

This class adds flexibility in defining the outcomes of the software usage presented in the paper/publication. It is meant to be used as a blank node and to be linked to exactly one software citation on the ScholarlyArticle entity.

Entity properties

2.9.1.1 schema:identifier

Definition

Link to the cited software application.

Rationale

At least one publication must have a SoftwareApplication citation to build the software citation corpus. This identifier must refer to the predefined SoftwareApplication entity.

Data constraint

schema:SoftwareApplication;

Obligation

Required

Repeatable

No

Usage notes

Example:

  • example for predefined SoftwareApplication entity

schema:identifier <http://software-citation.org/sw/yt>

Entity properties

2.9.1.2 scs:outcome

Definition

A more detailed explanation of how the software is used in the article.

Rationale

The scs:SoftwareOutcome entity carries information about how the software is used in the article. To create a record, read the article and annotate the figures or text that mention the software.

Data constraint

schema:Article

scs:Outcome

Obligation

Required

Repeatable

Yes

Usage notes

Example:

  • Example for target schema:Article entity

scs:citeSoftware [

a scs:SoftwareOutcome ;

schema:identifier <http://software-citation.org/sw/yt> ;

scs:outcome [

a schema:Article ;

schema:articleBody "Figure 1. Script that load"

]

]

  • Example for target scs:Outcome entity

scs:citeSoftware [

a scs:SoftwareOutcome ;

schema:identifier <http://software-citation.org/sw/yt> ;

scs:outcome [

a scs:Outcome ;

schema:articleBody "Figure 1. Script that load" ;

scs:usageType "visualization"

]

]

More details about the scs:Outcome class are given in 2.9.1.2.1 scs:Outcome.

Entity types

2.9.1.2.1 scs:Outcome

Entity properties

2.9.1.2.1.1 schema:articleBody

2.9.1.2.1.2 schema:pageStart

2.9.1.2.1.3 scs:usageType

Subclass of

schema:Article

Definition

A sub-entity of scs:SoftwareOutcome. This class is a subclass of schema:Article, so it can record the body text, section, and page where the citation occurs. In addition, the scs:usageType property describes how the cited software application is used in the publication.

Rationale

This class lets outcomes be accurately described and located within a scholarly article. It is also where an outcome is assigned a type, which can later be used as a search term.

Entity properties

2.9.1.2.1.1 schema:articleBody

Definition

The part of the article that mentions, uses, or cites the software.

Rationale

The text recorded in this property serves as a supporting statement that the publication cites the SoftwareApplication.

Data constraint

schema:Text

Obligation

Required

Repeatable

Yes

Usage notes

Example:

scs:citeSoftware [

a scs:SoftwareOutcome ;

schema:identifier <http://software-citation.org/sw/yt> ;

scs:outcome [

a scs:Outcome ;

schema:articleBody "Figure 1. Script that load" ;

scs:usageType "visualization" ;

]

]

Entity properties

2.9.1.2.1.2 schema:pageStart

Definition

The page of the article on which the outcome (the passage that mentions, uses, or cites the software) begins.

Rationale

Each outcome can have a pageStart that defines the page location of the outcome. The page number is relative to the article's own pages, not to the pages of the whole volume.

Data constraint

schema:Integer

Obligation

Optional

Repeatable

Yes

Usage notes

Example:

scs:citeSoftware [

a scs:SoftwareOutcome ;

schema:identifier <http://software-citation.org/sw/yt> ;

scs:outcome [

a scs:Outcome ;

schema:articleBody "Figure 1. Script that load" ;

schema:pageStart 2 ;

scs:usageType "visualization" ;

]

]

Entity properties

2.9.1.2.1.3 scs:usageType

Definition

If a user links the software citation to the scs:Outcome class, the entity must have at least one usageType to distinguish this usage from a plain link to the schema:SoftwareApplication class.

Rationale

We want to understand how the software is used in the publication: for visualization, computation, workflow, and so on.

Using the controlled vocabulary is strongly recommended.

Data constraint

schema:Text ;

Use the controlled vocabulary:

  • mention: if the article only mentions the software
  • development: if the article is about the development of the application
  • applied: if the article presents results or usage of the software. The property can be repeated to add more detail about the usage:
      • visualization
      • numerical computation
      • distributed computing
      • parallel processing
      • statistical method
      • machine learning
      • image processing
      • text mining
      • data cleaning
      • analytics platform

Obligation

Optional

Repeatable

Yes

Usage notes

Example:

scs:citeSoftware [

a scs:SoftwareOutcome ;

schema:identifier <http://softwarecitation.web.illinois.edu/sw/yt> ;

scs:outcome [

a scs:Outcome ;

schema:articleBody "Figure 1. Script that load" ;

schema:pageStart 2 ;

scs:usageType "visualization" ;

]

]
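Because scs:usageType is free text with a recommended vocabulary, a record creator may want to flag values that fall outside the list. The following is a minimal sketch under that assumption: the term sets are copied from the Data constraint list for this property, while the function name check_usage_types and the case-insensitive comparison are illustrative choices, not rules of the schema.

```python
# Controlled vocabulary copied from the Data constraint list for scs:usageType.
TOP_LEVEL = {"mention", "development", "applied"}
APPLIED_DETAILS = {
    "visualization", "numerical computation", "distributed computing",
    "parallel processing", "statistical method", "machine learning",
    "image processing", "text mining", "data cleaning", "analytics platform",
}


def check_usage_types(values):
    """Return the values that are not in the controlled vocabulary."""
    allowed = TOP_LEVEL | APPLIED_DETAILS
    # Comparison ignores case and surrounding whitespace (an assumption of this sketch).
    return [v for v in values if v.strip().lower() not in allowed]


unknown = check_usage_types(["applied", "visualization", "plotting"])
print(unknown)  # ['plotting']
```

Since the vocabulary is recommended rather than enforced, a flagged value is a prompt for review, not necessarily an error.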

Entity properties

2.10 schema:url

Definition

The URL or web page where a digital copy of the scholarly article recorded in this entity can be found.

Rationale

In the future we may want to check the actual work (article) annotated in this entity. This URL provides limited provenance for the origin of the article.

Data constraint

schema:URL

Obligation

Optional

Repeatable

Yes

Usage notes

Example:

  • schema:identifier "doi:/10.1086/673276" ;

schema:url <https://www.journals.uchicago.edu/doi/abs/10.1086/673276> ;

Entity types

3. schema:SoftwareApplication

Entity properties

3.1 schema:name

3.2 schema:alternate

3.3 schema:description

3.4 schema:about

3.5 schema:relatedLink

3.6 schema:funder

3.7 schema:version

3.8 schema:license

3.9 schema:isBasedOn

3.10 schema:softwareRequirements

Subclass of

schema:SoftwareApplication

Definition

The software application class is derived mainly from the class schema:SoftwareApplication. This entity defines the software that will later be attached to the ScholarlyArticle entities that use it. The software can be a library, an open source code package, or an application.

Rationale

The purpose of the SoftwareApplication entity in this metadata is to explain the usage of particular open source software in a publication. It is strongly recommended to use this entity for open source software only.

Complete Example

<software/#inPho>

        a schema:SoftwareApplication ;

        schema:name "The InPhO Project"^^xsd:string ;

        schema:alternate "Internet Philosophy Ontology (InPhO) project"^^xsd:string ;

        schema:alternate "Indiana Philosophy Ontology"^^xsd:string ;

        schema:description """Indiana Philosophy Ontology (InPhO) project, which uses a combination of automated

methods and expert feedback to create a dynamic computational ontology for the discipline of philosophy"""^^xsd:string ;

        schema:about "local:/data mining"^^xsd:string ;

        schema:about "local:/natural language processing"^^xsd:string ;

        schema:about "local:/Expert Feedback"^^xsd:string ;

        schema:about "local:/Machine Reasoning"^^xsd:string ;

        schema:relatedLink <https://www.inphoproject.org> ;

        schema:funder "NEH"^^xsd:string ;

        schema:license "CC BY-NC-SA 3.0"^^xsd:string ;

        schema:softwareRequirements "browser" ;

        .

Entity properties

3.1 schema:name

Definition

Name of the Software / Application / Packages.

Rationale

Software must have a name in order to be known by the public and to exist in a code repository. Software application names are typically unique.

Data constraint

Text

Obligation

Required

Repeatable

No

Usage notes

It is strongly recommended to use the package name, the common name of the software, or the name used in its repository.

For other repositories, please specify the name of the repository and the URL in the download link.

Example:

  • schema:name "Yt" ;
  • schema:name "InPho" ;
  • schema:name "scikit-learn" ;

Entity properties

3.2 schema:alternate

Definition

An alternate name that is known by the public. It can also be the long name if the package name is an abbreviation, or any other name that is known to refer to this SoftwareApplication.

Rationale

Sometimes the package name is not meaningful or distinctive enough for some users. An alternate name is helpful if there is a name that is better known by the public. A package name can also be an abbreviation, in which case the alternate name provides additional information about the library or software.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

Each schema:alternate property must define only one alternate name. If the software has multiple alternate names, repeat the property. No order is preserved among the values.

Example:

  • schema:name "Yt" ;

schema:alternate "yours truly" ;

  • schema:name "Scikit" ;

schema:alternate "SciPy Toolkits" ;

Entity properties

3.3 schema:description

Definition

A free-form sentence that describes the software application.

Rationale

This is a descriptive property that explains or provides more details about the software application. The information for this property can be derived from any source. However, if one exists, we strongly recommend using the description from the package repository.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

For a Python library, the description can be found on the PyPI page. In the future, this value could ideally be filled in automatically if the URL for the package is provided.

Example:

This is how the description can be derived from the PyPI page:

  • schema:name "Yt" ;

schema:description "yt is an open-source, permissively-licensed python package for analyzing and visualizing volumetric data. Yt supports…." ;

Entity properties

3.4 schema:about

Definition

Keywords that describe the usage or objective of the library.

Rationale

A software application may have been developed for a specific purpose. This property gives more information about the particular criteria under which this SoftwareApplication entity can be useful.

Data constraint

Text

Obligation

Optional

Repeatable

Yes

Usage notes

It is strongly recommended to use the controlled vocabulary, or the keywords on the package repository if they exist. For example, in the PyPI repository for Python, the controlled vocabulary can be found in the Topic section. This can also be automated if the URL is given.

This is how to get this value from the Python package manager:

  • Choose the respective package name, in this case yt 3.5.1

  • Read the values from the Topic section in the bottom left corner

Example:

  • schema:name "Yt" ;

schema:about "Scientific/Engineering :: Astronomy" ;

schema:about "Scientific/Engineering :: Physics" ;

schema:about "Scientific/Engineering :: Visualization" ;
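The usage notes say this step could be automated when the package URL is known. The following is a minimal sketch of the extraction step only. It assumes the package metadata has already been downloaded from PyPI's JSON endpoint (https://pypi.org/pypi/<project>/json) and parsed into a dictionary whose info.classifiers list holds the trove classifiers. The function name about_keywords is hypothetical, and the sample dictionary is a hand-written stand-in built from the example values above, not data retrieved from PyPI.

```python
def about_keywords(pypi_metadata):
    """Extract schema:about keywords from the 'Topic ::' classifiers."""
    classifiers = pypi_metadata.get("info", {}).get("classifiers", [])
    prefix = "Topic :: "
    # Keep only Topic classifiers and drop the leading "Topic :: " label.
    return [c[len(prefix):] for c in classifiers if c.startswith(prefix)]


# Hand-written stand-in for a parsed metadata document (illustrative only).
sample = {
    "info": {
        "name": "yt",
        "classifiers": [
            "Programming Language :: Python :: 3",
            "Topic :: Scientific/Engineering :: Astronomy",
            "Topic :: Scientific/Engineering :: Physics",
            "Topic :: Scientific/Engineering :: Visualization",
        ],
    }
}

for keyword in about_keywords(sample):
    print(f'schema:about "{keyword}" ;')
```

Keeping the download separate from the extraction makes the extraction easy to check by hand against the Topic section of the package page.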

Entity properties

3.5 schema:relatedLink

Definition

Homepage of the library/software or where we can find and download the software

Rationale

The provided URL can be used to derive information that can be scraped automatically. For example, the schema:about and schema:description values can be captured automatically if the URL of the PyPI package is provided. An automated script is not provided with this schema.

Data constraint

schema:URL

Obligation

Optional

Repeatable

Yes

Usage notes

For a Python package, this is the link to the PyPI page. For an R package, it is the link to the R repository page.

For a Python package, you can get the name from http://pypi.org as explained in section 3.4.

For an R package, you can get the name from: https://cran.r-project.org/web/packages/available_packages_by_name.html

Example:

  • schema:name "Yt" ;

schema:relatedLink <https://pypi.org/project/yt/> ;

  • schema:name "glmnet" ;

schema:alternate "glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models" ;

schema:relatedLink <https://cran.r-project.org/web/packages/glmnet/index.html> ;

Entity properties

3.6 schema:funder

Definition

The funding agency or people that contribute funding to the development of this software.

Rationale

In scholarly publications that rely on software, funding or grants are fundamental to the development, use, and maintenance of that software.

Data constraint

Text

We strongly recommend using the controlled vocabulary of organization names:

  • The National Science Foundation (NSF)
  • National Center for Supercomputing Applications (NCSA)
  • Moore Foundation
  • The National Institutes of Health (NIH) Office of Extramural Research

Obligation

Optional

Repeatable

Yes

Usage notes

Use an organization name from the controlled vocabulary or from the Library of Congress Name Authority File.

Example:

  • schema:name "Yt" ;

schema:funder "NSF" ;

Entity properties

3.7 schema:version

Definition

The version of the software application. This is an optional element that provides more detail when the software used in the publication refers to a specific version.

Rationale

Software applications often have multiple versions, and a publication might use a specific version for a reason, or simply because it was the version that existed when the paper was written or published. Recording the version can help with the reproducibility of an outcome.

Data constraint

Text

We strongly suggest following the Python or R versioning format if the software is a Python or R library.

Obligation

Optional

Repeatable

No

Usage notes

This is how to find the version number for Python or R packages.

  • Python

The circled number on this yt PyPI page example (https://pypi.org/project/yt/) is the latest version available in the repository. To see other versions of this PyPI package, click the release history menu on the left. It shows each version and the date it was released.

  • schema:name "Yt" ;

schema:relatedLink <https://pypi.org/project/yt/> ;

schema:version "3.5.1" ;

  • R

The version appears in the highlighted section of the R package page.

  • schema:name "glmnet" ;

schema:relatedLink <https://cran.r-project.org/web/packages/glmnet/> ;

schema:version "3.6-1" ;

Entity properties

3.8 schema:license

Definition

Defines the open source license applied to the software application. A single software application can have multiple open source licenses.

Rationale

The open source license is an important part of open source development and application because it determines how the software can be published, reused, or modified by users. Some licenses do not allow commercial use. Others, such as the GPL, have an "infectious" attribute: a published application that uses GPL software must itself be released under the GPL. Capturing this information also matters if we want to compare the distribution of licenses in the open source community, once the collection is large enough or the process can be automated.

Data constraint

Text

We strongly suggest using only values from the list of open source licenses: https://opensource.org/licenses/alphabetical

Obligation

Optional

Repeatable

Yes

Usage notes

This is how to find the license for Python or R packages.

  • Python

The circled part on this yt PyPI page example (https://pypi.org/project/yt/) shows the license attached to this package. For most packages that include license information, it can be found in the Meta section of the left menu.

  • schema:name "Yt" ;

schema:relatedLink <https://pypi.org/project/yt/> ;

schema:version "3.5.1" ;

schema:license "BSD License (BSD 3-Clause)" ;

  • R

The license can be found in the highlighted section of the R package page.

  • schema:name "glmnet" ;

schema:relatedLink <https://cran.r-project.org/web/packages/glmnet/> ;

schema:version "3.6-1" ;

schema:license "GPL-2" ;

Entity properties

3.9 schema:isBasedOn

Definition

If a software application is derived from or refers to a specific version of another work, or if the application is a submodule of a bigger project, this property preserves that relation and provides a link to the work or project associated with this SoftwareApplication entity.

Rationale

A SoftwareApplication project or library might have submodules, or might be related to other, larger works.

Data constraint

schema:SoftwareApplication

Obligation

Optional

Repeatable

Yes

Usage notes

Use only another entity of SoftwareApplication that is already defined on the dataset.

Example:

  • schema:name "unyt" ;

schema:isBasedOn <http://software-citation.org/sw/yt>  ;

Entity properties

3.10 schema:softwareRequirements

Definition

The programming language or runtime that a library requires in order to run, such as Java, R, or Python.

Rationale

This property records the component that the SoftwareApplication entity needs in order to run or operate properly.

Data constraint

schema:Text: use the controlled vocabulary

  • java
  • r
  • python
  • browser: for web applications

schema:SoftwareApplication: a predefined SoftwareApplication entity

Obligation

Optional

Repeatable

Yes

Usage notes

Example:

  • schema:name "yt" ;

schema:softwareRequirements "python" ;

  • schema:name "glmnet" ;

schema:softwareRequirements "r" ;

Entity types

4. schema:SoftwareSourceCode

Entity properties

4.1 schema:codeRepository

4.2 schema:isBasedOn

4.3 schema:targetProduct

4.4 schema:description

4.5 schema:contributor

Subclass of

schema:SoftwareSourceCode

Definition

The SoftwareSourceCode class is derived mainly from the class schema:SoftwareSourceCode. This entity defines the location of the source code on the internet or in an open source repository such as GitHub, GitLab, or SVN.

Rationale

This entity preserves information about the source code and can later be used to track the source code by version of the SoftwareApplication. It is also the parent of the contributor property, through which we can see the people who contribute to the application's development based on the source code.

Complete Example

<repository/#inPho>

        a schema:SoftwareSourceCode ;

        schema:codeRepository <https://github.com/inpho/> ;

        schema:targetProduct <software/#inPho> ;

        schema:description "Internet Philosophy Ontology (InPhO) Project main repository. Contains several sub items related to the InPho project" ;

        schema:contributor <developer/#jaimieMurdock> ;

        schema:contributor <developer/#colinAllen> ;

        schema:contributor <developer/#kirtanSakariya> ;

        schema:contributor <developer/#sriramIyer> ;

        .

Entity properties

4.1 schema:codeRepository

Definition

The URL of the repository where the source code of a software application can be found.

Rationale

In the open source community, everyone can contribute to the development of the software. Bug fixes and feature improvements are part of the development lifecycle. The repository URL provides the information needed to measure contributions to the project.

Data constraint

schema:URL

Obligation

Required

Repeatable

No

Usage notes

Example:

  • schema:name "yt" ;

schema:codeRepository <https://github.com/yt-project/yt> ;

Entity properties

4.2 schema:isBasedOn

Definition

Other software or a publication that inspired, or is related to, the development of this software repository.

Rationale

Software developed on the basis of a publication can be linked to the predefined publication. A group of developers that continues or forks another open source project can provide the SoftwareApplication or the URL of the other codeRepository.

Data constraint

schema:SoftwareApplication,

schema:Publication,

schema:SoftwareSourceCode,

schema:URL

We strongly recommend using the predefined SoftwareApplication, Publication or SoftwareSourceCode entities on the dataset that are strongly related to this source code or repository.

Obligation

Optional

Repeatable

Yes

Usage notes

Example:

  • schema:name "unyt" ;

schema:codeRepository <https://github.com/yt-project/unyt> ;

schema:isBasedOn <https://github.com/yt-project/yt> ;

This entity describes the repository unyt, which is a work based on yt.

Entity properties

4.3 schema:targetProduct

Definition

The targeted library, package, or binary that this source code is used for or compiled into.

Rationale

Linking the source code to the SoftwareApplication it produces connects the repository, and through it the contributors, to the software cited by publications.

Data constraint

schema:SoftwareApplication

Obligation

Required

Repeatable

No

Usage notes

This property must link to the predefined SoftwareApplication entity.

  • schema:name "yt" ;

schema:codeRepository <https://github.com/yt-project/yt> ;

schema:targetProduct <http://software-citation.org/sw/yt> ;

Entity properties

4.4 schema:description

Definition

Additional information about the repository. In most cases this is the text that describes the repository, or the repository title or description.

Rationale

A summary of the content of the software repository. This value can often be taken from the front page of the repository URL or from the README.md file in the repository.

Data constraint

schema:Text

Obligation

Optional

Repeatable

Yes

Usage notes

Example:

schema:description "A toolkit for analysis and visualization of volumetric data …."

Entity properties

4.5 schema:contributor

Definition

The people who contribute to the development of this software by participating in code development, future enhancements, or bug fixing.

Rationale

Source code in the open source community usually has more than one contributor. Sometimes a contributor is also an author, if the software is based on a publication, but most of the time the coders have no association with a publication. This property links to the scs:CodeMaintainer entities that contribute to the source code, giving proper attribution to the software maintainers, bug fixers, and developers.

Data constraint

scs:CodeMaintainer;

Obligation

Optional

Repeatable

Yes

Usage notes

For now, this property does not preserve ordinality, so it can be repeated without maintaining the order of contributions. This property must be attached to a predefined scs:CodeMaintainer entity.

Example:

schema:contributor <http://software-citation.org/cm/mathewTurk>
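The Obligation and Repeatable values listed for properties 4.1 through 4.5 can be checked mechanically before a record is published. The following is a minimal sketch, not part of the schema: the CONSTRAINTS table restates the values given in those sections, while the validate function and the dictionary-of-lists record format are assumptions made for illustration.

```python
# Obligation and repeatability for SoftwareSourceCode, as stated in sections 4.1-4.5.
CONSTRAINTS = {
    "schema:codeRepository": {"required": True, "repeatable": False},
    "schema:isBasedOn": {"required": False, "repeatable": True},
    "schema:targetProduct": {"required": True, "repeatable": False},
    "schema:description": {"required": False, "repeatable": True},
    "schema:contributor": {"required": False, "repeatable": True},
}


def validate(record):
    """record maps a property name to a list of values; returns a list of problems."""
    problems = []
    for prop, rule in CONSTRAINTS.items():
        count = len(record.get(prop, []))
        if rule["required"] and count == 0:
            problems.append(f"{prop} is required but missing")
        if not rule["repeatable"] and count > 1:
            problems.append(f"{prop} must not be repeated")
    return problems


# Illustrative record that omits the required schema:targetProduct.
record = {
    "schema:codeRepository": ["https://github.com/yt-project/yt"],
    "schema:contributor": ["http://software-citation.org/cm/mathewTurk"],
}
print(validate(record))  # ['schema:targetProduct is required but missing']
```

The same table-driven approach would apply to the other entity types by swapping in their constraint values.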

Entity types

5. scs:CodeMaintainer

Entity properties

5.1 schema:identifier

5.2 schema:name

5.3 schema:sameAs

Subclass of

scs:Author

Definition

This entity represents a contributor to a particular body of source code.

Rationale

Sometimes a coder or developer of an open source application is not related to any publication author. This entity, which we call CodeMaintainer, represents people who work on the application's code development but have no relation to the publication's authorship.

Complete Example

<developer/#jaimieMurdock>

        a scs:CodeMaintainer ;

        schema:identifier <https://github.com/JaimieMurdock> ;

        schema:name "Jaimie Murdock" ;

        schema:sameAs <author/#jaimieMurdock> ;

        .

Entity properties

5.1 schema:identifier

Definition

Identifier of the code maintainer.

Rationale

A code maintainer must have an identifier related to the work they have done on the software. Because we are working with open source software, the code maintainer can easily be derived from the repository where the code is located. For better preservation, we strongly recommend using the URL of the code maintainer's page in the repository. In the future, this value could be derived automatically if the repository provides an API call or function that lists the code maintainers.

Data constraint

schema:URL

Obligation

Required

Repeatable

No

Usage notes

This is how to get the CodeMaintainer value from the application repository.

From the project page, click the contributors tab at the top right.

On the detailed contributors page, click the link on a contributor's name.

The link redirects the browser to the user page. Use this URL as the contributor ID.

Example:

<cm/#JamieMurdock>

        a scs:CodeMaintainer ;

        schema:identifier <https://github.com/JaimieMurdock>;
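The rationale notes that this value could be derived automatically when the repository offers an API. The following is a minimal sketch of the mapping step only. It assumes the contributor list has already been retrieved from the GitHub REST API (GET /repos/{owner}/{repo}/contributors) and parsed into a list of objects with login and html_url fields. The function names and the <cm/#login> naming pattern are illustrative assumptions, and the sample list is hand-written from the example above, not fetched from GitHub.

```python
def maintainer_records(contributors):
    """Map contributor objects to a local node name and an identifier URL."""
    records = []
    for person in contributors:
        records.append({
            "node": f"<cm/#{person['login']}>",
            "identifier": person["html_url"],
        })
    return records


def to_turtle(record):
    """Render one record as an scs:CodeMaintainer description."""
    return (
        f"{record['node']}\n"
        "        a scs:CodeMaintainer ;\n"
        f"        schema:identifier <{record['identifier']}> ;\n"
        "        ."
    )


# Hand-written stand-in for a parsed API response (illustrative only).
sample = [
    {"login": "JaimieMurdock", "html_url": "https://github.com/JaimieMurdock"},
]

for rec in maintainer_records(sample):
    print(to_turtle(rec))
```

schema:name is left out here because the API login is not necessarily the display name. It would still need to be taken from the user page as described in 5.2.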

Entity properties

5.2 schema:name

Definition

Name of the code maintainer

Rationale

A code maintainer may show a different name on their repository user page. If there is no information about the real name, we strongly recommend using this property to record the display name from the user page.

Data constraint

schema:Text

Obligation

Required

Repeatable

Yes

Usage notes

As with the identifier, this value should be derived from the user page.

Example:

<cm/#JamieMurdock>

        a scs:CodeMaintainer ;

        schema:identifier <https://github.com/JaimieMurdock>;

schema:name "Jaimie Murdock" ;

Entity properties

5.3 schema:sameAs

Definition

If this developer is also an author of a publication, use this property to link the contributor to the Author entity.

Rationale

When open source code is based on a publication, we need a link that connects the application developer with the authorship of the publication. This property preserves that information so that the overlap between developers and authors can be seen when querying the data.

Data constraint

scs:Author

Obligation

Optional

Repeatable

Yes

Usage notes

<cm/#JamieMurdock>

        a scs:CodeMaintainer ;

        schema:identifier <https://github.com/JaimieMurdock>;

schema:name "Jaimie Murdock" ;

schema:sameAs <ra/#JamieMurdock> ;


Test Corpus / Project Proposal

  1. Allen, Colin, and the InPhO Group. "Cross-Cutting Categorization Schemes in the Digital Humanities." Isis 104.3 (2013): 573-583. https://www.journals.uchicago.edu/doi/abs/10.1086/673276

The first publication about InPhO. It discusses why and how the software was developed (purpose and goal).

Example: http://softwarecitation.web.illinois.edu/corpus/article/#inphoarticle1

  2. Murdock, Jaimie, Jiaan Zeng, and Robert H. McDonald. "Topic exploration with the htrc data capsule for non-consumptive research." Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries. ACM, 2015. https://dl.acm.org/citation.cfm?id=2756929

Combining the Hathi Trust Research Center (HTRC) data capsule with the InPhO software makes a good use case for associating two different open source packages in one publication.

Example: http://softwarecitation.web.illinois.edu/corpus/article/#inphoarticle2

  3. Murdock, Jaimie, and Colin Allen. "InPhO for all: why APIs matter." Journal of the Chicago Colloquium on Digital Humanities and Computer Science. Vol. 1. No. 3. 2011.

https://www.researchgate.net/publication/267806761_InPhO_for_All_Why_APIs_Matter

This paper talks about the usage of one particular function in the open-source library.

Example: http://softwarecitation.web.illinois.edu/corpus/article/#inphoarticle3

  4. Buckner, Cameron, Mathias Niepert, and Colin Allen. "From encyclopedia to ontology: Toward dynamic representation of the discipline of philosophy." Synthese 182.2 (2011): 205-233. https://link.springer.com/article/10.1007/s11229-009-9659-9

This paper talks about the development community of this project.

Example: http://softwarecitation.web.illinois.edu/corpus/article/#inphoarticle4

  5. Holtzman, Benjamin, et al. "Seismic sound lab: Sights, sounds and perception of the earth as an acoustic space." International Symposium on Computer Music Multidisciplinary Research. Springer, Cham, 2013. https://link.springer.com/chapter/10.1007/978-3-319-12976-1_10

Summary

This data dictionary was created to add support for scientific software citations with the intention of crediting all humans, entities and software involved. We aimed to create an RDF structured schema that would connect all components accurately.