Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All $refs must use #/$defs/{def_name} syntax rather than {def_name}.yaml/json #3369

Closed
3 tasks done
Veetaha opened this issue May 16, 2024 · 4 comments · Fixed by #3430
Closed
3 tasks done

All $refs must use #/$defs/{def_name} syntax rather than {def_name}.yaml/json #3369

Veetaha opened this issue May 16, 2024 · 4 comments · Fixed by #3430

Comments

@Veetaha
Copy link
Contributor

Veetaha commented May 16, 2024

Clear and concise description of the problem

Right now the following simple TSP declaration

@jsonSchema("https://my-domain.ua")
model Foo {
    bar: Bar
}

model Bar {}

produces this JSON schema with tsp compile .

$schema: https://json-schema.org/draft/2020-12/schema
$id: https://my-domain.ua
type: object
properties:
  bar:
    $ref: Bar.yaml
required:
  - bar
$defs:
  Bar:
    $schema: https://json-schema.org/draft/2020-12/schema
    $id: Bar.yaml
    type: object
    properties: {}

A single file with the schema is output, which is cool and convenient. However the $refs point to non-existent files. I understand that each object in $defs also gets an $id with the file name that $refs point to, however all VSCode extensions that I use for providing completions and docs with JSONSchema don't support this kind of intra-file references. They all think as if Bar.yaml is supposed to be an external schema and try to load that file instead.

The extensions that I tested it with which don't support this style of references are:

Expected

The $refs should all use the syntax #/$defs/{def_name} instead. This is what extensions usually expect when it comes to intra-schema-file references.

I don't know why {def_name}.yaml/json syntax for references was chosen by default, but if it's really useful in some use cases, then I propose to make the behavior configurable. TSP should be able to generate #/$defs/{def_name} via the emitter options, for example.

Workaround

Right now I'm working around this problem with this custom decorator that I put at the top level type definition in my JSON schema. It overrides the default $id of the generated definitions to use the desired syntax. However, this is quite a hack.

/**
 * @param {import("@typespec/compiler").DecoratorContext} context
 * @param {import("@typespec/compiler").Type} target
 */
export function $workaround(context, target) {
    traverseTypes(target, type => {
        switch (type.kind) {
            case "Enum":
            case "Union":
            case "Scalar":
            case "Model": {
                const id = jsonSchema.getId(context.program, type);
                if (id == null && type.name != null) {
                   jsonSchema.$id(context, type, `#/$defs/${type.name}`);
                }
            }
        }
    })
}

/**
 * @param {import("@typespec/compiler").Type} type
 * @param {(type: import("@typespec/compiler").Type) => void} visit
 */
function traverseTypes(type, visit) {
    visit(type);
    const recurse = (type) => traverseTypes(type, visit);

    switch (type.kind) {
        case "Model": {
            for (const property of type.properties.values()) {
                recurse(property);
            }
            if (type.indexer != null) {
                recurse(type.indexer.key);
                recurse(type.indexer.value);
            }
            break;
        }
        case "ModelProperty": {
            recurse(type.type);
            break;
        }
        case "Tuple": {
            for (const element of type.values) {
                recurse(element);
            }
            break;
        }
        case "Union": {
            for (const variant of type.variants.values()) {
                recurse(variant);
            }
            break;
        }
        case "UnionVariant": {
            recurse(type.type);
            break;
        }
        case "Enum": {
            for (const member of type.members.values()) {
                recurse(member);
            }
            break;
        }
    }
}

With this workaround decorator that I place on top of Foo model I get the following output:

$schema: https://json-schema.org/draft/2020-12/schema
$id: https://my-domain.ua
type: object
properties:
  bar:
    $ref: "#/$defs/Bar"
required:
  - bar
$defs:
  Bar:
    $schema: https://json-schema.org/draft/2020-12/schema
    $id: "#/$defs/Bar"
    type: object
    properties: {}

Checklist

  • Follow our Code of Conduct
  • Read the docs.
  • Check that there isn't already an issue that request the same feature to avoid creating a duplicate.
@markcowl
Copy link
Contributor

  • Potentially provide a flag, but make this the default

@markcowl markcowl added this to the [2024] June milestone May 20, 2024
@markcowl
Copy link
Contributor

pri: 2
est: 5

@bterlson
Copy link
Member

Some background and info:

The JSON Schema emitter produces valid JSON Schema 2020-12. When the emitter needs to combine multiple schemas into a single file, either because the user provided a bundleId to the emitter, or because a type references another type outside a JSON Schema namespace (as is the case above), it does this by producing a bundle as defined in the JSON Schema 2020-12 specification. This is to ensure that you always get consistent behavior whether or not you choose to bundle.

I have found that VSCode does not support JSON Schema 2020-12. This emitter implements the bundling behavior added in that specification. Even with using json pointer style references, VSCode will not provide completions for authoring 2020-12 schemas.

I think the reference approach when passing bundleId needs to stay the way it is. That's a bundle, it should follow the standard. However, I believe it is wrong to produce a bundle for the situation in this issue. Looking at the sample above, we create a bundle that implies the existence of https://my-domain.ua/Bar.yaml, which is incorrect based on the TypeSpec where Bar does not exist in that namespace. This is why we don't emit it as a separate schema file in the first place. And so we shouldn't bundle that schema either.

I think the correct fix here is the following:

  • When a type references another type outside a JSON Schema namespace, include the referenced type under $defs (as we do today)
  • Such referenced types do not have a $id or $schema field
  • Such referenced types are referenced via JSON pointers not ids
  • When passing a bundleId, such referenced types are not included in the $defs of the bundle, but are included in the bundled schema's $defs for each time they are referenced (in order to comply with the specification requirements for not altering references of bundled schemas).

What this would mean is that for this specific case, where referenced types outside of the JSON Schema namespace are included in the same file, the reference would use JSON Pointers and tooling would work in VSCode. However, when passing bundleId, the bundled schemas would still use standard referencing by id and would not work in VSCode. @Veetaha would this work for your use case?

@Veetaha
Copy link
Contributor Author

Veetaha commented May 22, 2024

@Veetaha would this work for your use case?

Yes, it would work for me. Thank you for clarifying the bundling approach. Now it makes sense to me. The behavior when using bundleId should stay the same. What needs altering is the behavior for types that are (recursively) referenced as part of a @jsonSchema-anotated type. They could use the JSON pointer syntax for references and be compatible with older versions of JSON schema standards.

github-merge-queue bot pushed a commit that referenced this issue May 24, 2024
Fixes #3369 by changing how types are bundled. In particular, when a
type is not a JSON Schema type, we never create a root schema for it.
Instead, it is inlined into the defs of any schema which references it,
and referenced using a JSON pointer. This PR makes bundling have
essentially no impact on emitted schemas, and is merely a way to bundle
them into a single file.

The approach is as follows:

* When a type references another type outside a JSON Schema namespace,
include the referenced type under $defs:
   * Such referenced types do not have a $id or $schema field
   * Such referenced types are referenced via JSON pointers not ids
* Bundling does not alter the bundled schemas or introduce new root
schemas. This changes two things from what we do today:
* The `$id` of the bundled schemas now includes the file path as it does
for non-bundled schemas (whereas before it was just the type name)
* non-JSON Schema types do not get $defs in the bundle, so the bundle
has the same root schemas as would be written to disk when not bundling.
  
In terms of implementation, the basic approach is to not handle bundling
via the emitter framework source files. Instead, we always create source
files for root schemas, and inline the necessary defs as we did before
(but now using JSON pointers). Then when we're about to write source
files, if we're bundling we assemble the bundle and emit that single
file, otherwise we emit each source file that contains a root schema.

Todo: 

* [ ] Validate that the bundled schemas continue to work with ajv.
* [ ] Cleanups

---------

Co-authored-by: Vitalii Kryvenko <gersoh3@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants