KAI explained

Definitions

Purpose

Kertos is developing a SaaS solution in the field of data protection, information security, and compliance (hereinafter referred to as the "Kertos platform"). As part of various innovation grants, Kertos is developing several advanced features for the Kertos platform, including a proprietary artificial intelligence called KAI (Kertos Artificial Intelligence). KAI is designed to further reduce the workload of Kertos platform users and to provide targeted assistance and answers to their inquiries.

Definition

Within this document, Kertos applies the ISO/IEC 22989 definition of an Artificial Intelligence System (AI system), i.e.: an "engineered system that generates outputs such as content, forecasts, recommendations or decisions for a given set of human-defined objectives".

Implementation

In our application, AI is implemented in the form of a copilot, leveraging generative AI—such as large language models (LLMs)—to assist users with complex cognitive tasks. The copilot works as an intelligent assistant, enhancing user productivity by providing relevant insights, suggestions, and actions.

The main characteristics of a copilot include:

  • Natural Language Input: Users interact with the copilot using natural language commands or even code, making the experience intuitive and accessible.

  • Human Control: While the copilot provides suggestions and outputs, the final decisions, guidance, and approval rest with the human users, ensuring that the AI complements rather than replaces human judgment.

  • Scalability with Complexity: The copilot becomes increasingly valuable as task complexity rises, offering more sophisticated and contextually relevant outputs in response to more detailed or intricate queries.

Application of AI / LLM

Brief description of the application, model or service  

At the core of this application is a sophisticated large language model (LLM). The model processes inputs by identifying patterns and applying learned algorithms, which allows the AI to structure information coherently and present well-organized content to end users. The LLM enables real-time analysis and processing of large volumes of data, ensuring that each response is contextually relevant and aligned with the users' compliance objectives.

KAI (Kertos AI), the AI-driven assistant within the Kertos platform, serves as an intelligent guide for users, dynamically providing accurate and timely responses. By utilizing LLM techniques, KAI interprets complex queries, especially in the context of compliance and certification preparation, and presents detailed, contextually appropriate answers.

What does the application enable?  

KAI (Kertos AI) provides all users of the Kertos platform with intelligent guidance and detailed, sector-specific responses. It is designed to help customers prepare for data privacy and information security standards and regulations (e.g., ISO 27001, GDPR), making the process more efficient by offering real-time, accurate answers related to compliance. This reduces the need for manual research and accelerates certification readiness.

In which domain is the application used and where is it hosted?

KAI operates within the data protection, cybersecurity, and compliance sectors, supporting organizations in managing regulatory requirements and preparing for certifications. The application is securely hosted on European servers using AWS infrastructure, ensuring compliance with European data protection regulations, including GDPR. In the future, data may also be transferred to AWS servers in the USA as further functionalities are added. Such a transfer would be legitimized by an adequacy decision pursuant to Art. 45 GDPR, since AWS is certified under the EU-U.S. Data Privacy Framework.

Organization

Which organisation provides the software?  

The application is provided by Kertos GmbH, Klosterhofstraße 6, 80331 München, Germany. The AI technology powering the application is sourced from Amazon Bedrock, utilizing advanced models such as Claude for its AI-driven functionalities. This partnership provides robust and scalable AI capabilities to meet the compliance and cybersecurity needs of users.

How established is the company and what references does it have?

Kertos GmbH was founded by Johannes Hussak, Kilian Schmidt, and Alexander Prams to revolutionize data protection and compliance automation. As part of various innovation grants, Kertos GmbH is developing several advanced features for its SaaS platform, Kertos OS, including the proprietary Kertos Artificial Intelligence (KAI). KAI is designed to further streamline workflows for users, providing targeted, automated assistance and guidance, especially in the areas of data protection and compliance.

Which services, models or algorithms are used from the portfolio?

The KAI platform integrates several advanced AI models and services to enhance its features and provide a seamless user experience. The key services and models utilized include:

  • Amazon Bedrock Service: A fully managed service that allows seamless integration of multiple foundation models into the platform.

  • Claude 3 by Anthropic: A sophisticated language model from Anthropic, used for its ability to handle complex queries and provide detailed, context-aware responses.

  • Claude 3.5 by Anthropic: A more advanced iteration of Claude, offering enhanced capabilities for understanding and responding to user queries with improved precision and depth.

Model comparison of Claude 3.5 Sonnet and Claude 3 Sonnet by Anthropic:

| | Claude 3.5 Sonnet | Claude 3 Sonnet |
| --- | --- | --- |
| Description | Most intelligent model | Balance of intelligence and speed |
| Strengths | Highest level of intelligence and capability | Strong utility, balanced for scaled deployments |
| Multilingual | Yes | Yes |
| Vision | Yes | Yes |
| API model name | claude-3-5-sonnet-20240620 | claude-3-sonnet-20240229 |
| API format | Messages API | Messages API |
| Max output | 8192 tokens | 4096 tokens |
| Training data cut-off | Apr 2024 | Aug 2023 |

For more details, see Anthropic's model documentation.

What features/characteristics do the respective services have?  

The key features of the integrated services include:

  • Language Processing: These models excel in understanding and interpreting natural language, allowing for context-aware and accurate responses to user queries.

  • Text Generation: Both Claude 3 and Claude 3.5 are designed for advanced text generation, enabling them to produce detailed, coherent responses tailored to user needs, particularly in compliance and cybersecurity contexts.

  • Contextual Understanding: The models are fine-tuned to manage complex, industry-specific queries, ensuring that responses align with regulatory requirements.

These features ensure that KAI provides accurate, relevant, and efficient interactions for users.

KAI's inference parameters are as follows:

  • temperature = 1

  • top_k = 250

  • top_p = 1
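
To make these parameters concrete, the following is a minimal sketch of how a Claude model can be invoked through Amazon Bedrock's Messages API with the settings above. The model ID, AWS region, and the prompt-assembly step are illustrative assumptions, not Kertos' actual implementation:

```python
# Minimal sketch: calling Claude 3.5 Sonnet via Amazon Bedrock's Messages API
# with the sampling parameters listed above. Model ID, region, and the
# prompt-assembly step are illustrative assumptions.
import json

import boto3

# EU region, matching the European hosting described earlier (assumption).
bedrock = boto3.client("bedrock-runtime", region_name="eu-central-1")


def ask_kai(context_from_db: str, user_query: str) -> str:
    """Combine retrieved context with the user's query and invoke the model."""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,  # illustrative; well under the 8192-token maximum
        "temperature": 1,
        "top_k": 250,
        "top_p": 1,
        "messages": [
            {
                "role": "user",
                "content": f"Context:\n{context_from_db}\n\nQuestion: {user_query}",
            }
        ],
    }
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
        body=json.dumps(body),
    )
    result = json.loads(response["body"].read())
    return result["content"][0]["text"]
```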

Algorithms and Data

Which specific algorithms are used?  

The application utilizes Claude 3 and Claude 3.5 by Anthropic, both based on advanced language models for natural language processing and text generation.

What data was used to train the algorithms?  

Claude 3 and Claude 3.5 were trained on large, publicly available text datasets from the internet; the specific composition of the training data is not disclosed. As of September 27, 2024, Claude 3 Opus and Claude 3 Haiku are trained on data up to August 2023, and Claude 3.5 Sonnet on data up to April 2024.

Claude uses publicly available data from the internet, third-party datasets acquired from suppliers, and data provided by users. For further reference: https://support.anthropic.com/en/articles/7996885-how-do-you-use-personal-data-in-model-training.

Are there any further data provided?  

In addition to user-provided content, the application uses data related to the ISO 27001:2022 and GDPR standards to enhance compliance and regulatory support.

Are there reference implementations (e.g., open source)?

KAI primarily utilizes proprietary models, such as Claude 3 and Claude 3.5 by Anthropic, and does not rely on open-source implementations. These models are integrated through Amazon Bedrock for text generation and natural language processing capabilities. Therefore, no direct reference implementations are available as open-source for these models.

What happens to the data a user is providing to KAI?

KAI does not use user data or allow others to use user data to train the machine learning models used to provide KAI.

Artificial intelligence and machine learning models can improve over time to better address specific use cases. We may use data we collect from your use of KAI to enhance our models when you (i) voluntarily provide Feedback to us, such as by labeling Output with a thumbs up or thumbs down, or (ii) give us your explicit permission.
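
As an illustration of the opt-in rule described above, the sketch below marks an interaction as usable for model improvement only when the user has rated the output or given explicit permission; all names and types are hypothetical, not Kertos' actual implementation:

```python
# Hypothetical sketch of feedback capture honoring the opt-in conditions above:
# data becomes usable for model improvement only via (i) voluntary feedback or
# (ii) explicit permission. Names and types are illustrative.
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedbackEvent:
    interaction_id: str
    rating: Optional[str] = None       # "thumbs_up" / "thumbs_down" if given
    explicit_permission: bool = False  # user explicitly allowed training use

    @property
    def usable_for_model_improvement(self) -> bool:
        # Condition (i): voluntary feedback; condition (ii): explicit consent.
        return self.rating is not None or self.explicit_permission


# Example: a thumbs-up rating makes the interaction eligible for improvement use.
event = FeedbackEvent(interaction_id="abc-123", rating="thumbs_up")
assert event.usable_for_model_improvement
```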

We will retain your personal data only for as long as necessary to provide our services to you, or for other legitimate business purposes, such as resolving disputes, ensuring safety and security, or complying with legal obligations. The duration for which we keep personal data will depend on various factors, including:

- The purpose for processing the data (e.g., whether it’s needed to deliver our services);

- The amount, nature, and sensitivity of the data;

- The potential risk of harm from unauthorized use or disclosure;

- Any legal requirements that apply to us.

| Data Collected | Retention Period |
| --- | --- |
| Model used | 1 year |
| Prompt data: context from the database | 1 year |
| Prompt data: query entered by the user | 1 year |
| Output data | 1 year |
| User feedback | 1 year |
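
A minimal sketch of how the one-year retention periods in the table above could be enforced; the category names and record shape are illustrative assumptions:

```python
# Minimal sketch of enforcing the one-year retention periods from the table
# above. The category names and record shape are illustrative assumptions.
from datetime import datetime, timedelta, timezone

RETENTION_PERIODS = {
    "model_used": timedelta(days=365),
    "prompt_context": timedelta(days=365),
    "prompt_query": timedelta(days=365),
    "output_data": timedelta(days=365),
    "user_feedback": timedelta(days=365),
}


def is_expired(category: str, collected_at: datetime) -> bool:
    """True when a record has outlived its retention period and should be purged."""
    return datetime.now(timezone.utc) - collected_at > RETENTION_PERIODS[category]
```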

Purpose  

What people-based decision should be supported by the application?  

The application supports decision-making aimed at ensuring compliance with ISO 27001:2022 and GDPR in customer businesses. However, responsibility for ensuring the completeness, consistency, and correctness of the data remains with the user providing it.

For what purpose is the application used?  

The application is designed to guide and support users of the Kertos platform by providing information on privacy and information security topics. It helps streamline processes and supports compliance with key regulatory standards.

In which process is the software integrated?

KAI is integrated within the no-code SaaS platform Kertos, assisting users in managing privacy and security frameworks.

Which people/profiles use the application?  

KAI is primarily used by:

  • Legal departments

  • Data protection officers

  • Quality managers

  • Subject matter experts

  • Information security experts

How do the users use the software?  

Users leverage the Kertos platform to organize and manage their privacy and information security processes, monitor compliance, and handle risk-related issues efficiently, using KAI as a guide throughout the platform's features.

Maturity Level

Roll-Out: Beta launch

The application is currently in its beta phase, allowing select users to test core functionalities and provide feedback for improvement.

Application

Product

KAI, the AI-driven assistant integrated into the Kertos platform, is designed to help users manage privacy, information security, and compliance processes.

Performance Indicators, Deployment and Monitoring

How are the results monitored?

KAI's outputs, together with the inputs provided by users (including the user data described above), are monitored to achieve organization-wide AI system objectives. KAI is deployed only after the testing measures are completed and the metrics defined in the policies are met.

KAI is developed in an isolated development environment, deployed to the production environment on AWS after testing procedures, and made available to users at app.kertos.io.

The performance of KAI is evaluated by a human expert responsible for creating and overseeing the training. This expert assesses the accuracy, relevance, and utility of the AI’s responses to ensure that they meet the required standards and compliance objectives. Additionally, feedback from users during the beta phase is incorporated to refine the system.

Which KPIs are used for the evaluation?

The correctness of the generated content and of the test questions is evaluated against the following metrics:

  • Factual Correctness: Evaluates whether the information provided is accurate and aligns with the principles and requirements of ISO 27001:2022. Responses should be fact-based, reflecting a deep understanding of the standard.

  • Relevancy: Assesses whether the model's responses are directly applicable to queries regarding ISO 27001:2022. The information should be specific to the context of information security management and the ISO standard, avoiding unrelated details. For example, if a user asks, "What are the key objectives of ISO 27001:2022?", KAI should respond with information directly related to the objectives outlined in the standard, such as: "The primary objectives of ISO 27001:2022 include establishing, implementing, maintaining, and continually improving an information security management system (ISMS)." If KAI responds with information about GDPR instead, this would be considered irrelevant, as it does not pertain to the user's specific query about ISO 27001:2022.

  • Completeness: Ensures that the model's answers cover all aspects of the question, particularly the key elements of ISO 27001:2022. The response should not omit critical information related to compliance, risk management, and security controls.

  • Response Time: The longest answers should not exceed 50 seconds.

  • Conciseness: The model's responses should be concise and to the point, avoiding unnecessary elaboration. Responses should be clear, avoiding extraneous details and ensuring that complex topics are communicated in a focused manner. For example, for a query like "What is an ISMS?", a concise response could be: "An ISMS, or Information Security Management System, is a set of policies and procedures designed to manage sensitive data systematically. It helps organizations manage security risks by implementing controls to protect the confidentiality, integrity, and availability of information."
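
To illustrate how these KPIs could be aggregated during the beta phase, the following sketch scores expert-labeled responses against all five metrics, with the 50-second response-time limit checked automatically; the record shape and pass criteria are assumptions for illustration:

```python
# Illustrative KPI aggregation: an expert labels each response for factual
# correctness, relevancy, completeness, and conciseness; response time is
# checked against the documented 50-second limit. Shapes are assumptions.
from dataclasses import dataclass

MAX_RESPONSE_SECONDS = 50  # "The longest answers should not exceed 50 seconds."


@dataclass
class EvaluationRecord:
    factually_correct: bool
    relevant: bool
    complete: bool
    concise: bool
    response_seconds: float

    def passes(self) -> bool:
        """A response passes only if every KPI threshold is met."""
        return (
            self.factually_correct
            and self.relevant
            and self.complete
            and self.concise
            and self.response_seconds <= MAX_RESPONSE_SECONDS
        )


def pass_rate(records: list[EvaluationRecord]) -> float:
    """Share of evaluated responses meeting all KPI thresholds."""
    return sum(r.passes() for r in records) / len(records)
```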

Potential Cyberattacks

The following classes of cyberattack will be monitored to protect the AI system and make it more resilient to potential threat vectors and actors:

Evasion attacks occur after an AI system is deployed and attempt to alter an input to change how the system responds to it. Examples include adding markings to stop signs so that an autonomous vehicle misinterprets them as speed limit signs, or creating confusing lane markings that make the vehicle veer off the road.

Poisoning attacks occur in the training phase, introducing corrupted data. An example would be slipping numerous instances of inappropriate language into conversation records so that a chatbot interprets these instances as common enough parlance to use in its own customer interactions.

Privacy attacks occur during deployment and attempt to learn sensitive information about the AI, or about the data it was trained on, in order to misuse it. An adversary can ask a chatbot numerous legitimate questions and then use the answers to reverse engineer the model so as to find its weak spots or guess at its sources. Adding undesired examples to those online sources could make the AI behave inappropriately, and making the AI unlearn those specific undesired examples after the fact can be difficult.

Abuse attacks involve the insertion of incorrect information into a source, such as a webpage or online document, that an AI then absorbs. Unlike the aforementioned poisoning attacks, abuse attacks attempt to feed the AI incorrect information from a legitimate but compromised source in order to repurpose the AI system's intended use.
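
As one illustrative building block of such monitoring, the sketch below flags prompts that match simple prompt-injection heuristics for human review; the patterns and names are hypothetical, and real monitoring for the attack classes above would be considerably broader:

```python
# Hypothetical building block for input monitoring: flag prompts that match
# simple prompt-injection heuristics for human review. The patterns below are
# illustrative only; production monitoring would cover far more signals.
import logging
import re

SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?(system )?prompt", re.IGNORECASE),
]

logger = logging.getLogger("kai.monitoring")


def screen_prompt(prompt: str) -> bool:
    """Return True (and log for review) if the prompt looks suspicious."""
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(prompt):
            logger.warning("Suspicious prompt flagged: %r", prompt[:200])
            return True
    return False
```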

Ethics and Risk Considerations

How is bias made transparent in the data?

Bias in AI refers to systematic discrepancies that may arise in the results due to various sources, such as data collection, algorithm design, or human interpretation. In this application, bias is made transparent to users through explicit notices, informing them of potential biases in the data or model output. This ensures that users are aware of any factors that may affect the results and can interpret them accordingly.

How do you rate the potential for abuse?

The potential for abuse is rated as low.

Does the data set identify any subpopulations (e.g., by age, gender)?

No, the dataset used does not explicitly identify subpopulations based on characteristics like age or gender. It does not include personally identifiable information related to natural persons, which helps mitigate risks related to privacy or bias against specific groups.

How are customers trained to interpret algorithms?  

Customers are provided with clear, understandable answers from the system. These answers are designed to be self-explanatory, allowing users to interpret them easily without requiring extensive technical knowledge. Additional training or guidance is available if needed to help customers make the most of the AI-generated responses.

Are there any environmental impacts?

Environmental impacts are limited through the use of AWS infrastructure; Amazon has committed to reaching net-zero carbon emissions by 2040. More information on Amazon's sustainability efforts can be found here: https://sustainability.aboutamazon.com/products-services/the-cloud?energyType=true.

Are there other risks; what is the expected damage (examples: safety, fairness / non-discrimination, privacy, security)?  

The main risks associated with this application stem from the general risks inherent in AI, particularly Large Language Models (LLMs). These include:

  • Data use for training: Input data might be used unknowingly to train the LLM, which could lead to unintended learning outcomes.

  • Model drift: Incorrect or biased inputs can cause the LLM to produce inaccurate or inappropriate outputs over time.

  • Hallucinations: The model may generate content that is factually incorrect or misleading, a phenomenon known as "hallucinations."

These risks highlight the need for users to critically assess the outputs of the AI and implement proper safeguards to prevent misuse.

Risk Classification based on EU AI Act

The articles and annexes of the EU AI Act were decisive for the classification:

Is the AI system part of the prohibited AI practices listed in Article 5?

KAI's operations do not involve any subliminal techniques or biometric capabilities; KAI generates outputs based on the prompts provided by the user.

Is the AI system intended to be used as a safety component of a product, or is the AI system itself a product covered by the Union harmonisation legislation listed in Annex I?

KAI is not part of any product listed in the harmonisation legislation, nor is it a safety component of such a product.

Is the AI system used in any of the high-risk domains listed in Annex III pursuant to Article 6:

  1. Biometrics

  2. Critical infrastructure

  3. Education and vocational training

  4. Employment, workers management and access to self-employment

  5. Access to essential private services and essential public services and benefits

  6. Law enforcement

  7. Migration, asylum and border control management

  8. Administration of justice and democratic processes?

KAI is not used in any of these areas. It is used as a chatbot with advisory capabilities, helping individuals with their questions in the field of information security and privacy. It is also incapable of acting on its own.

Does the AI system:

  1. interact with natural persons?

  2. generate or manipulate media (videos, texts, images, etc.)?

  3. detect emotions, feelings, or other biometric signals?

KAI does interact with natural persons and can generate textual outputs based on user inputs. However, it does not detect emotions or feelings, nor does it process biometric data. As a system that interacts with users, KAI is classified under Article 52, which governs AI systems interacting with natural persons.

This classification positions KAI as a non-high-risk AI system that is subject to the transparency requirements of Article 52 of the AI Act, and neither high-risk nor prohibited under the regulation.

Conclusion of internal risk assessment

The internal risk assessment carried out by Kertos concludes that KAI does not fall into the "high-risk" category as defined by the "Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts" (hereinafter the "EU AI Act").