
Concepts and Data Structures

Introduction

This section introduces the fundamental concepts used by Kyla AI and the data structures that represent them.

Basic Concepts

Kyla API represents concepts as nodes. Any disease, symptom, or other concept is represented as a node, which has a set of possible states, often referred to as outcomes.

For example, hypertension is a medical condition with two states: present or absent. The API also always provides an option that corresponds to a lack of knowledge: unknown, which is equivalent to answering a question with "I don't know".

Kyla groups concepts into broad categories such as conditions, risk factors, symptoms, and lab tests. A brief description of each category is provided below:

  • Conditions - medical conditions or diseases. Conditions always have two possible states: present or absent.
  • Risk Factors - factors that influence the likelihood of a disease. For example, smoking is a risk factor for lung cancer. Note that a risk factor only makes sense in the context of a disease.
  • Symptoms - observable manifestations of a medical condition that can help in the diagnostic process. Typical examples of symptoms are fever and cough.
  • Lab Tests - similar to symptoms, but they correspond to specialist medical tests, such as the alanine transaminase (ALT) blood test.
note

These categories are NOT mutually exclusive. A node can belong to more than one category: for example, hypertension is a condition, but it can also be a risk factor for other conditions.

Node Definition

A node represents a concept, which can be medical (e.g. hypertension) or more general (e.g. education). Each node has a unique identifier (node id), which is one of the key means of accessing information in Kyla API. A node id is a string consisting of a capital letter followed by 5 digits. For example, Hypertension has node id C00020.
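The identifier format described above can be checked on the client side before a request is sent. The following is a minimal sketch, assuming the format is exactly one capital ASCII letter followed by five digits; the helper name `is_valid_node_id` is hypothetical and not part of Kyla API.

```python
import re

# Assumption: a node id is one capital ASCII letter followed by exactly five digits.
NODE_ID_PATTERN = re.compile(r"[A-Z][0-9]{5}")


def is_valid_node_id(node_id: str) -> bool:
    """Return True if node_id matches the format described in the text."""
    return NODE_ID_PATTERN.fullmatch(node_id) is not None


if __name__ == "__main__":
    print(is_valid_node_id("C00020"))  # identifier taken from the Hypertension example
    print(is_valid_node_id("c00020"))  # lowercase first letter does not match
```

A check like this only confirms the shape of the string; it cannot tell whether a node with that identifier actually exists.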

Each node has the following definition:

  • id - a unique identifier of a node
  • title - a user-friendly name of a concept represented by the node
  • outcomes - a list of possible states of the node. Each outcome is defined by:
    • outcome - a unique identifier of an outcome (within a node)
    • label - a user-friendly description corresponding to the outcome
  • type - determines whether the node is DISCRETE or CONTINUOUS
  • unitOfContinuousType - only relevant to continuous nodes; specifies the unit of measurement (e.g. day or mg)

Kyla AI supports two types of node definitions: DISCRETE and CONTINUOUS. Continuous nodes still have discrete outcomes, which correspond to value ranges, but they additionally accept numeric values. A continuous node must have a unit of measurement defined.

Example of a discrete node definition:

{
  "id": "C00020",
  "title": "Hypertension",
  "outcomes": [
    {
      "outcome": "present",
      "label": "Yes"
    },
    {
      "outcome": "absent",
      "label": "No"
    }
  ],
  "type": "DISCRETE",
  "unitOfContinuousType": null
}

Example of a continuous node definition:

{
  "id": "N05865",
  "title": "Low Density Lipoproteins (LDL)",
  "outcomes": [
    {
      "outcome": "unknown",
      "label": "Don't know"
    },
    {
      "outcome": "a70_160",
      "label": "a70_160"
    },
    {
      "outcome": "a160_over",
      "label": "a160_over"
    },
    {
      "outcome": "aless_70",
      "label": "aless_70"
    }
  ],
  "type": "CONTINUOUS",
  "unitOfContinuousType": "mg/dL"
}
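On the client side, a node definition can be loaded into a typed structure that enforces the rule that continuous nodes have a unit of measurement. The following is a minimal sketch, not an official client: the class names `Outcome` and `Node` and the function `parse_node` are hypothetical, and only the JSON field names come from the definition above.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class Outcome:
    outcome: str  # identifier of the outcome, unique within its node
    label: str    # user-friendly description


@dataclass(frozen=True)
class Node:
    id: str
    title: str
    outcomes: Tuple[Outcome, ...]
    type: str  # "DISCRETE" or "CONTINUOUS"
    unit_of_continuous_type: Optional[str] = None


def parse_node(data: Dict[str, Any]) -> Node:
    """Build a Node from a decoded JSON node definition (hypothetical helper)."""
    node = Node(
        id=data["id"],
        title=data["title"],
        outcomes=tuple(Outcome(o["outcome"], o["label"]) for o in data["outcomes"]),
        type=data["type"],
        unit_of_continuous_type=data.get("unitOfContinuousType"),
    )
    if node.type not in ("DISCRETE", "CONTINUOUS"):
        raise ValueError(f"unexpected node type: {node.type!r}")
    # The text requires a unit of measurement for every continuous node.
    if node.type == "CONTINUOUS" and not node.unit_of_continuous_type:
        raise ValueError(f"continuous node {node.id} has no unit of measurement")
    return node


if __name__ == "__main__":
    raw = """{"id": "C00020", "title": "Hypertension",
              "outcomes": [{"outcome": "present", "label": "Yes"},
                           {"outcome": "absent", "label": "No"}],
              "type": "DISCRETE", "unitOfContinuousType": null}"""
    node = parse_node(json.loads(raw))
    print(node.title, [o.outcome for o in node.outcomes])
```

Frozen dataclasses are used here so that a parsed definition cannot be modified by accident; this is a design choice of the sketch rather than a requirement of the API.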

Evidence Definition

In Kyla API, information about a patient is stored in a data structure called evidence. Two facts about the patient are always required: age and sex. Any other information about the patient is provided in the form of observations, which are effectively key-value pairs where the key is a node identifier and the value is a state of that node.

The Evidence structure is defined as follows:

  • sex - sex of the patient, with two possible options: MALE and FEMALE
  • age - age of the patient expressed in whole years (e.g. 35)
  • observations - a list of observations. Each observation is defined by:
    • nodeId - a node identifier
    • outcome - the observed state of the node, taken from the list of outcomes defined for that nodeId
note

The order of the observations is arbitrary and does not affect the results of queries.

Example of evidence for a 35-year-old male patient who is an active smoker (N00004), has declined to disclose his drinking habits (N00007), and has diagnosed hypertension (C00020):

{
  "sex": "MALE",
  "age": 35,
  "observations": [
    {
      "nodeId": "C00020",
      "outcome": "present"
    },
    {
      "nodeId": "N00004",
      "outcome": "active_smoking"
    },
    {
      "nodeId": "N00007",
      "outcome": "unknown"
    }
  ]
}
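An evidence structure like the one above can be assembled and sanity-checked in code before it is sent. The following is a minimal sketch under stated assumptions: the function `build_evidence` and its `allowed_outcomes` argument (a mapping from node id to the outcome identifiers taken from each node definition) are hypothetical, and rejecting negative ages is an assumption of this sketch rather than a documented rule.

```python
import json
from typing import Any, Dict, Set


def build_evidence(
    sex: str,
    age: int,
    observations: Dict[str, str],
    allowed_outcomes: Dict[str, Set[str]],
) -> Dict[str, Any]:
    """Return an evidence dict, checking each outcome against its node (hypothetical helper)."""
    if sex not in ("MALE", "FEMALE"):
        raise ValueError(f"sex must be MALE or FEMALE, got {sex!r}")
    # Age is expressed in whole years; bool is excluded because it is a subclass of int.
    if isinstance(age, bool) or not isinstance(age, int) or age < 0:
        raise ValueError(f"age must be a non-negative whole number, got {age!r}")

    observation_list = []
    for node_id, outcome in observations.items():
        # The outcome must be one of those defined for the node.
        if outcome not in allowed_outcomes.get(node_id, set()):
            raise ValueError(f"{outcome!r} is not a known outcome of node {node_id}")
        observation_list.append({"nodeId": node_id, "outcome": outcome})

    return {"sex": sex, "age": age, "observations": observation_list}


if __name__ == "__main__":
    # Illustrative subsets only; real outcome lists come from the node definitions.
    allowed = {
        "C00020": {"present", "absent", "unknown"},
        "N00004": {"active_smoking", "unknown"},
        "N00007": {"unknown"},
    }
    evidence = build_evidence(
        "MALE",
        35,
        {"C00020": "present", "N00004": "active_smoking", "N00007": "unknown"},
        allowed,
    )
    print(json.dumps(evidence, indent=2))
```

Because observations are keyed by node id in this sketch, each node can appear at most once; whether the API itself accepts repeated observations for the same node is not stated in this section.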