DESRIST2026

Proceedings

Paper podcasts

Five-minute AI-generated audio briefings and summaries for the papers. Search, press play, and share.

85 papers
Reputation as a Sociotechnical Design Problem: A Social Systems Theory Lens for Business Reputation Systems
DESRIST 2026 · Part I

Ulvi Ibrahimli, Simon Hemmrich and Axel Winkelmann
🎤 Liv Knowles & Leo Kant
This study applies Niklas Luhmann's social systems theory to conceptualize blockchain-based business reputation systems as integrated sociotechnical systems. The authors propose a design lens that aligns social layers (observation, selection, communication, system trust, and elements/relations) with technical layers. The methodology uses conceptual design research, illustrated and analyzed through a stylized market scenario.

The problem. Existing research on reputation systems is predominantly technical, neglecting how human interpretation, social behaviors, and incentives shape system outcomes. As a result, current blockchain-based solutions lack an integrated conceptual foundation, making it difficult to resolve social challenges like fake ratings, lack of feedback incentives, and reputation inflation. This gap prevents these systems from establishing trust beyond their technical infrastructure.

Key findings.
– The study conceptualized a social design layer for reputation systems based on Luhmann's systems theory, aligning social and technical mechanisms (conceptually supported).
– The role of the social layer was illustrated through a market scenario where tipping serves as a voluntary reputation signal and rating disclosure serves as an economic incentive (conceptually supported).
– The study does not contain empirical testing, statistical analysis, or quantitative validation of the proposed framework, which remains a key limitation (no statistical support).
– The conceptual nature of the study introduces limitations: its high level of abstraction constrains immediate practical guidance, and the market scenario assumes that buyers are consistently willing to tip.
What it means for you
  • CIO / IT Executive: Schedule a 9:00 AM meeting with your Lead Enterprise Architect to review the architecture of all active blockchain or digital trust projects, and mandate the integration of a 'Social Layer' assessment to map how user interpretation, system trust, and communication dynamics align with the technical infrastructure.
  • IT Manager: Gather your software development and UX design teams to plan a sprint dedicated to prototyping a dual-signal reputation mechanism, specifically pairing standard ratings with a voluntary tipping feature to test if it reduces reputation inflation and fake ratings.
  • Business Strategist: Draft a business case for a pilot incentive model in your platform's marketplace that uses rating disclosures as economic rewards for service providers, testing whether voluntary buyer tips can serve as a reliable, high-trust reputation signal.
  • Researcher: Write a research proposal to design and execute a controlled behavioral experiment that empirically tests the paper's key assumption: whether platform buyers are actually willing to consistently tip as a voluntary reputation signal.
  • Policymaker: Draft an internal policy memorandum requiring that any future public sector procurement guidelines for decentralized ledger or identity platforms must include a 'Sociotechnical Impact Assessment' to address social risks like feedback manipulation and rating inflation.
Reputation Systems, Systems Theory, Social Layer, Blockchain, Sociotechnical Design, Trust
Evidence-Based Design of a Knowledge Chatbot: From Interaction Mechanisms to Design Principles
DESRIST 2026 · Part I

Malte Högemann, István Pál and Oliver Thomas
🎤 Lev Kenning & Lore Koestler
This study investigates how workflow-integrated chatbots can optimize enterprise knowledge retrieval. Employing a design science research methodology, the authors developed a prototype in the finance department of a medium-sized German company and evaluated its usability and performance.

The problem. Employees often waste significant time searching for fragmented internal information or clarifying ambiguous documentation. While large language models and retrieval-augmented generation show promise, organizations struggle to design systems that ensure user trust, data privacy, and seamless day-to-day adoption.

Key findings.
– The chatbot prototype achieved a mean usability score of 68.4 on the Chatbot Usability Questionnaire, placing it slightly above the industry average benchmark of 68. (Supported)
– The prototype demonstrated a high conversational accuracy rate of 88.2% (247 out of 280 interactions) during user testing. (Supported)
– The chatbot's ability to reliably interpret and reason with Excel-based tabular data was mixed, highlighting an ongoing technical limitation of the model. (Mixed)
– The study is limited by its small sample size (15 employees in the field study), single organizational context, and specific language and platform setup. (Limitations)
What it means for you
  • CIO / IT Executive: On Monday morning, mandate a technical audit of your enterprise RAG pipeline's data privacy controls to ensure secure handling of sensitive financial and internal documents before scaling the chatbot pilot.
  • IT Manager: On Monday morning, review your chatbot's data ingestion pipeline and implement a rule to flag or pre-process complex Excel spreadsheets into clean, structured markdown to mitigate the system's performance limitations with tabular data.
  • Business Strategist: On Monday morning, shadow three finance team members for one hour to document their manual search paths for fragmented information, creating a prioritized list of specific documents the chatbot must ingest next to maximize daily time savings.
  • Researcher: On Monday morning, draft a research proposal to replicate the chatbot pilot across three different departments to expand the sample size beyond 15 users and test how different languages and operational setups affect conversational accuracy.
  • Policymaker: On Monday morning, establish an organizational policy requiring all enterprise knowledge chatbots to display mandatory source attribution links for every response to maintain user trust and enable verification of AI-generated advice.
Knowledge Work, Microsoft Copilot, Enterprise Chatbot, Design Science Research, Retrieval-Augmented Generation, Knowledge Management
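
The IT Manager suggestion above (pre-processing complex Excel spreadsheets into clean, structured markdown before ingestion) could be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline: it assumes the spreadsheet rows have already been read into Python lists by some other tool, and the helper name `rows_to_markdown` and the sample data are hypothetical.

```python
from typing import Sequence


def rows_to_markdown(rows: Sequence[Sequence[object]]) -> str:
    """Render a header row plus data rows as a pipe-delimited markdown table.

    Hypothetical helper: assumes rows[0] is the header row and that cell
    values were already extracted from the spreadsheet elsewhere.
    """
    if not rows:
        return ""
    width = len(rows[0])

    def fmt(row: Sequence[object]) -> str:
        # Pad short rows, stringify cells, and escape pipes so that a cell
        # value cannot break the column layout.
        cells = list(row) + [""] * (width - len(row))
        cleaned = [
            "" if c is None else str(c).replace("|", "\\|").strip()
            for c in cells[:width]
        ]
        return "| " + " | ".join(cleaned) + " |"

    lines = [fmt(rows[0]), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(r) for r in rows[1:])
    return "\n".join(lines)


# Invented sample data for illustration only.
sample = [
    ["Cost center", "Q1", "Q2"],
    ["Finance", 1200, 1350],
    ["IT", 900, None],
]
markdown_table = rows_to_markdown(sample)
print(markdown_table)
```

Whether a flat markdown table actually improves retrieval on a given spreadsheet would need to be checked case by case; merged cells and multi-sheet workbooks are not handled here.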
NeuroAdaptX: Designing Neuro-Adaptive Explanations for Cognitive Accessibility in Explainable AI Interfaces
DESRIST 2026 · Part I

Enrico Bunde
🎤 Liv Knowles & Leo Kant
This study designs and evaluates NeuroAdaptX, an explainable AI interface that adapts presentation-level properties like structure, modality, and density to self-reported neurodiversity profiles. Grounded in Cognitive Load Theory and Cognitive Fit Theory, the author establishes design principles for cognitive accessibility. The system's utility is evaluated through a randomized between-subjects online experiment with 216 participants against a static baseline.

The problem. Standard explainable AI systems often assume a universal user profile, neglecting cognitive diversity such as ADHD, autism spectrum conditions, and dyslexia. This lack of customization can impose excessive cognitive load, degrade comprehension, and distort trust. Prior research has focused on content relevance rather than structural presentation accessibility, leaving a critical gap in design knowledge for inclusive interface adaptation.

Key findings.
– Hypothesis 1 (supported): NeuroAdaptX significantly lowered perceived cognitive load compared to the static baseline interface (p < .001).
– Hypothesis 2 (supported): The adaptive interface led to significantly higher objective comprehension of the AI's explanations (p = .004).
– Hypothesis 3 (supported): Users reported significantly higher explanation satisfaction and quality with the neuro-adaptive design (p < .001).
– Hypothesis 4 (supported): The treatment significantly improved calibrated and appropriate trust compared to the baseline (p < .001).
– Hypothesis 5 (not supported): Exploratory testing showed no statistically significant moderation effect of specific neurodiversity profiles on cognitive load or comprehension, suggesting directionally consistent benefits across ADHD, ASC, and dyslexia groups.
– Hypothesis 6 (supported): The adaptive interface significantly improved perceived informational justice (p = .001).
– Limitations: The study's vignette-based online format abstracts from high-stakes environments and repeated-use dynamics, and profiles were determined via self-report rather than clinical diagnostics.
What it means for you
  • CIO / IT Executive: Update the enterprise IT procurement policy template to mandate that any new AI-driven decision-support software vendors must support customizable presentation-level controls, specifically allowing users to adjust explanation density, format (visual vs. text), and text structure to ensure cognitive accessibility.
  • IT Manager: Direct the lead front-end developer during the Monday morning standup to create a JIRA ticket to replace the current static, text-heavy explanation blocks in our customer-facing AI dashboard with simple user toggles for 'bulleted view' versus 'detailed narrative view' to lower user cognitive load.
  • Business Strategist: Review current client feedback and churn metrics, then write a product pitch for the executive committee proposing 'cognitive accessibility toggles' as a unique, market-differentiating feature for our B2B SaaS platform to improve client retention and user trust metrics.
  • Researcher: Draft a research proposal and IRB application to run a 6-month longitudinal study comparing adaptive AI explanation interfaces against static interfaces using verified clinical diagnostic data (rather than self-reports) in a high-stakes, real-world industry setting.
  • Policymaker: Draft a policy amendment to the agency's Digital Accessibility Guidelines to define AI 'explainability' not just as a matter of algorithmic clarity, but as a requirement for adaptive presentation standards that accommodate cognitive diversity such as ADHD, autism, and dyslexia.
Design Science Research, Explainable AI, Human-AI Interaction, Neurodiversity, Cognitive Accessibility
Towards a Design Theory of Discourse Strategies for Conversational Agents
DESRIST 2026 · Part I

Shirley Gregor, Alexander Maedche and Stefan Morana
🎤 Lev Kenning & Lore Koestler
This study proposes a preliminary design theory for structuring and orchestrating discourse strategies in conversational agents. Utilizing Habermas's Theory of Communicative Action (TCA) as a kernel theory, the authors synthesize existing research to define prescriptive design principles that address cognitive, affective, and behavioral outcomes.

The problem. Prior frameworks for explainable AI and human-agent interaction are siloed and fail to provide an overarching theory of human-machine discourse. Furthermore, older interaction models operate on the naive assumption that agents are inherently honest, failing to address risks such as user manipulation, deception, or emotional persuasion in organizational settings.

Key findings.
– Synthesizes a conceptual design theory identifying five core discourse strategies: instrumental, strategic, expressive, normative, and communicative action.
– Conceptually supports the Instrumental Strategy for achieving high task performance and efficiency while minimizing cognitive load.
– Conceptually supports the Communicative Action Strategy (such as providing transparent explanations) for improving user trust, learning, and overall satisfaction.
– Conceptually supports the Expressive Strategy, grounded in the CASA paradigm, for eliciting emotional trust and perceived social presence through human-like cues.
– Identifies that Strategic (deceptive or covert) and Communicative Action (transparent) strategies are mutually exclusive and cannot be deployed simultaneously.
– Limitation: The proposed principles are based on a conceptual synthesis of prior literature and have not yet been empirically validated in real-world user testing.
What it means for you
  • CIO / IT Executive: Initiate an immediate audit of the organization's active conversational AI agents to categorize them by discourse strategy, mandating the removal of any hidden 'Strategic' (persuasive/covert) prompts in favor of 'Communicative Action' (transparent, explainable) designs to prevent user manipulation.
  • IT Manager: Update the system prompt templates for your primary customer support chatbot to prevent the mixing of 'Strategic' and 'Communicative Action' behaviors, hardcoding an explicit rule that forces the agent to provide verifiable explanations instead of persuasive shortcuts.
  • Business Strategist: Redesign the onboarding flow of your customer-facing virtual assistant to isolate 'Expressive' elements (such as empathetic social cues) to initial rapport-building, while switching to 'Instrumental' strategies during transactional phases to optimize task efficiency.
  • Researcher: Design and launch an empirical user-testing protocol comparing a 'Communicative Action' chatbot variant (using transparent explanations) against an 'Instrumental' variant (using minimal, direct task completion) to measure differences in user trust and cognitive load.
  • Policymaker: Draft a policy amendment for your organization's AI governance framework that explicitly forbids conversational systems from deploying 'Strategic' (covertly persuasive) discourse strategies, mandating standard disclosures that signal when an agent is acting in a 'Communicative' capacity.
Conversational Agents, Discourse Strategy, Explanations, Design Theory, Theory of Communicative Action, Human-AI Interaction
Designing Hyper-personalized Financial Co-pilots: An Artifact for Building Trust Through Conversational Advice
DESRIST 2026 · Part I

Constantin Brîncoveanu, K. Valerie Carl and Oliver Hinz
🎤 Lev Kenning & Lore Koestler
This study employs a Design Science Research (DSR) approach to develop and evaluate a web-based conversational financial co-pilot. The designed prototype integrates a stateful, session-based memory with a Large Language Model (LLM) to deliver tailored financial advice based on a user's unique goals. The utility and trust-building capacity of the artifact were evaluated through a qualitative focus group with five domain experts.

The problem. Current digital financial services face a major trust deficit due to static robo-advisors that offer limited personalization and ungrounded conversational agents that lack logical constraint enforcement. Consequently, existing digital platforms fail to replicate the competence-based trust that human advisors establish, which hinders their adoption and scalability. This research addresses the challenge of designing conversational interfaces that can demonstrate genuine competence to earn user trust.

Key findings.
– The qualitative evaluation showed that stateful memory can significantly enhance perceived system competence, though its performance proved fragile to evolving user preferences.
– The system's failure to consistently enforce negative constraints (such as recommending excluded assets) was identified as a critical breach of competence that severely damages trust.
– A significant gap was observed between superficial persona-matching and true hyper-personalization, indicating that users judge personalization by the depth of conversational inquiry before a recommendation is made.
– To build sufficient trust in high-stakes financial domains, conversational AI must be paired with institutional trust signals, such as clear liability frameworks, risk disclosures, and graceful management of out-of-boundary requests.
– The study's findings are constrained by a small evaluation sample (n = 5 experts) and its cross-sectional, short-term design, necessitating future quantitative and longitudinal validation.
What it means for you
  • CIO / IT Executive: Convene a meeting with your Lead Enterprise Architect to mandate the integration of a hard-coded constraint-enforcement layer that overrides LLM outputs when negative boundaries (such as excluded asset classes) are violated.
  • IT Manager: Create a ticket in the development backlog to implement a stateful database schema that specifically tracks, updates, and enforces user-defined negative constraints across active conversational chat sessions.
  • Business Strategist: Revise the product roadmap to replace superficial persona-matching onboarding steps with a conversational flow that mandates at least three deep, diagnostic questions to explore user goals before generating any financial recommendations.
  • Researcher: Draft a longitudinal study protocol to test the co-pilot's trust-building efficacy over a three-month period with a cohort of 150 retail investors, specifically measuring trust erosion when user preferences evolve.
  • Policymaker: Draft a regulatory guidance memo requiring conversational financial AI platforms to display explicit, real-time risk disclosures and provide automated graceful degradation prompts when user queries go out-of-bounds.
Hyper-Personalization, Financial Technology, Conversational AI, Trust in Automation, Design Science Research
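
The constraint-enforcement layer suggested for the financial co-pilot above (overriding LLM output when an excluded asset class is recommended) might look something like the sketch below. It is an assumption-laden illustration rather than the authors' artifact: the `UserConstraints` class, the `enforce` function, and the asset names are all hypothetical, and matching is done by simple case-insensitive string comparison.

```python
from dataclasses import dataclass, field


@dataclass
class UserConstraints:
    """Session-scoped negative constraints (hypothetical structure)."""

    excluded_assets: set[str] = field(default_factory=set)

    def exclude(self, asset: str) -> None:
        self.excluded_assets.add(asset.lower())

    def allow(self, asset: str) -> None:
        # User preferences can evolve, so exclusions must be removable too.
        self.excluded_assets.discard(asset.lower())


def enforce(
    recommended: list[str], constraints: UserConstraints
) -> tuple[list[str], list[str]]:
    """Split model recommendations into (kept, blocked) lists."""
    kept: list[str] = []
    blocked: list[str] = []
    for asset in recommended:
        if asset.lower() in constraints.excluded_assets:
            blocked.append(asset)
        else:
            kept.append(asset)
    return kept, blocked


# Invented example: the user has ruled out crypto earlier in the session.
constraints = UserConstraints()
constraints.exclude("Crypto")
llm_output = ["Global equity ETF", "crypto", "Government bonds"]
kept, blocked = enforce(llm_output, constraints)
print("kept:", kept)
print("blocked:", blocked)
```

The point of keeping this check outside the model is that the exclusion holds even when the model forgets it; exact string matching is a simplification, and a real system would need a mapping from product names to asset classes.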
No Labels, No Problem? Designing an Explainable Unsupervised Anomaly Detection System for Ambient Assisted Living
DESRIST 2026 · Part I

Constantin Brîncoveanu, Leon Ruckes, Aaron Witzki, Tobias Dreesbach and K. Valerie Carl
🎤 Liv Knowles & Leo Kant
This study details the design, development, and evaluation of a web-based support system for anomaly detection in Ambient Assisted Living. The researchers developed a hybrid system utilizing unsupervised machine learning algorithms, specifically Isolation Forest and LSTM-Autoencoder, combined with SHAP-based Explainable AI to interpret alerts. The artifact was evaluated through a quantitative analysis of model agreement on unlabeled sensor data alongside a qualitative focus group with six domain experts.

The problem. Applying machine learning to automate health-related anomaly detection in eldercare is severely limited by a scarcity of labeled training data. Furthermore, existing unsupervised algorithms operate as opaque black boxes, providing alerts without explanation. This lack of interpretability prevents caregivers from establishing the trust required to act on automated alerts.

Key findings.
– Quantitative evaluation supported the implementation of a hybrid model, confirming that Isolation Forest and LSTM-Autoencoder algorithms detect orthogonal, complementary types of anomalies (point outliers versus temporal pattern breaks).
– Qualitative evaluation supported the existence of a 'Model Selection Paradox,' showing that giving users algorithmic choices and displaying conflicting outputs significantly erodes trust and increases cognitive load.
– Qualitative evaluation supported the need for contextualized explanations, finding that raw feature importance scores are insufficient for decision-making without semantic baseline comparisons.
– Focus group feedback supported a user preference for push-based notification interactions rather than continuous pull-based dashboard monitoring.
– The study's limitations include a small expert focus group sample (n = 6), a short-term evaluation period, and reliance on sensor data from a single provider.
What it means for you
  • CIO / IT Executive: Send an email to your Lead AI Architect to audit all active predictive and monitoring applications, directing them to immediately remove any user-facing model-selection toggles or conflicting algorithmic outputs to prevent the 'Model Selection Paradox' from eroding end-user trust.
  • IT Manager: Task your development team during Monday morning's standup to transition the system's interface away from raw SHAP/feature-importance charts and instead program semantic baseline comparisons (e.g., translating a raw sensor spike into 'Target woke up 2 hours later than their 14-day average'), while shifting the alert delivery from a pull-based dashboard to push-based SMS or mobile notifications.
  • Business Strategist: Meet with the product management team to update the software roadmap, shifting the primary value proposition and marketing copy from 'continuous dashboard tracking' to 'exception-based push alerts,' directly aligning the product's feature set with caregivers' documented preference for passive, push-based workflows.
  • Researcher: Draft a research protocol and partnership proposal to send to a secondary eldercare sensor provider, aiming to initiate a 12-month longitudinal study with a cohort of at least 50 participants to validate the hybrid Isolation Forest/LSTM-Autoencoder model beyond the limitations of the initial 6-person, single-provider pilot.
  • Policymaker: Draft a policy memorandum for the digital health advisory committee recommending that future certification standards for AI-driven eldercare systems mandate the use of contextualized, semantic explanations rather than raw technical feature weights as a baseline requirement for deployment in public care facilities.
Ambient Assisted Living, Unsupervised Learning, Anomaly Detection, Explainable AI, Design Science Research
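
The "semantic baseline comparison" idea from the eldercare entry above (turning a raw sensor value into a statement relative to the person's recent average) can be illustrated with a short sketch. This is not the paper's implementation: the function name, the one-hour threshold, and the wake-time data are hypothetical, and a plain rolling mean stands in for whatever baseline the authors used.

```python
from statistics import mean


def explain_wake_time(
    history_hours: list[float],
    today_hours: float,
    threshold_hours: float = 1.0,
) -> str:
    """Describe today's wake-up time relative to the recent average.

    history_hours: earlier wake-up times in hours after midnight, oldest first.
    Only the most recent 14 entries are used as the baseline (an assumption).
    """
    window = history_hours[-14:]
    if not window:
        raise ValueError("at least one historical value is required")
    baseline = mean(window)
    delta = today_hours - baseline
    if abs(delta) < threshold_hours:
        return "Wake-up time is within the usual range."
    direction = "later" if delta > 0 else "earlier"
    return (
        f"Woke up {abs(delta):.1f} hours {direction} "
        f"than the {len(window)}-day average."
    )


# Invented data: two weeks of 7:00 wake-ups, then a 9:00 wake-up.
message = explain_wake_time([7.0] * 14, 9.0)
print(message)
```

A message like this could be sent as a push notification alongside, or instead of, a feature-importance chart, in line with the focus group's stated preference.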
“Do AI Yourself”: Designing a Toolbox to Empower Small- and Medium-Sized Enterprises to Embrace AI-as-a-Service
DESRIST 2026 · Part I

Pauline Speckmann, Alexander van der Staay, Mihael Markic, Yngve Kelch, Maximilian Nebel, Jens Poeppelbuss and Christian Janiesch
🎤 Lev Kenning & Lore Koestler
This study details the development and qualitative evaluation of an interactive AI-as-a-Service (AIaaS) toolbox designed to help small- and medium-sized enterprises (SMEs) independently assess, design, and adopt AI initiatives. Applying a design science research approach, the authors developed a set of design requirements, principles, and features, and instantiated them in an interactive prototype.

The problem. SMEs face significant pressure to adopt artificial intelligence to stay competitive, yet they frequently lack the internal expertise and financial resources to execute these projects. This lack of guidance often leads to exploratory, technology-first experiments that fail to align with strategic business needs, resource capacities, or regulatory requirements.

Key findings.
– The conceptual design knowledge (consisting of 4 requirements, 8 design principles, and 21 design features) was qualitatively supported and validated by SME practitioners.
– Providing concrete, archetypal use cases (DP4) was rated as the most highly relevant design principle, receiving a perfect score of 5 out of 5 from all interviewed SMEs.
– A design conflict was identified between the linear progress sequence of the prototype and the practitioners' preference for a flexible, non-hierarchical 'pick-and-choose' navigation style, which was addressed in final design adjustments.
– Results are based on qualitative evaluations with a limited sample of 11 practitioners from 5 SMEs assessing a non-functional prototype rather than a fully deployed system in a live environment.
What it means for you
  • CIO / IT Executive: On Monday morning, halt all custom, ground-up AI development proposals and mandate that all departments map their AI needs to a list of pre-defined, off-the-shelf AI-as-a-Service (AIaaS) archetypes to prevent resource drain and align with strategic business capabilities.
  • IT Manager: On Monday morning, restructure your internal AI exploration portal, replacing any rigid, step-by-step setup wizards with a flexible, modular 'pick-and-choose' menu of approved AIaaS APIs that non-technical team leads can freely browse and test.
  • Business Strategist: On Monday morning, schedule a workshop with department heads to map their top three operational bottlenecks directly to concrete, proven AI use-case archetypes (like automated document classification or basic sentiment analysis) instead of brainstorming abstract, technology-first AI ideas.
  • Researcher: On Monday morning, draft a grant proposal to develop a fully functional, cloud-deployed version of the AIaaS toolbox and establish partnerships with local SME networks to run a quantitative, longitudinal study tracking actual system adoption and usability metrics.
  • Policymaker: On Monday morning, initiate a policy proposal for an 'SME AIaaS Enablement Voucher' program that subsidizes the adoption of pre-vetted, compliant, and modular AI-as-a-Service toolkits to help small businesses safely bypass their internal expertise and budget deficits.
AIaaS, Artificial Intelligence, AI Adoption, DSR, Design Principles, Information Systems Development
Navigating the XAI Forest: Designing a Tree-Based Decision Support Tool for Context-Aware Method Recommendations
DESRIST 2026 · Part I

Sophie Haas, Moritz-Andre Weiher, Malte Högemann and Oliver Thomas
🎤 Liv Knowles & Leo Kant
This study presents the design, development, and evaluation of XAI-MAP, an interactive decision-tree-based tool designed to recommend specific explainable AI (XAI) methods based on use-case characteristics. Utilizing a design science research approach, the researchers gathered requirements from interviews and literature to build the prototype, which was then evaluated via a task-based study with 20 AI developers.

The problem. Selecting appropriate XAI methods for specific applications is challenging for AI developers who lack specialized XAI expertise. Existing selection guidelines are often fragmented, operate only at a coarse method-class level, or assume prior methodological knowledge, leaving a gap for context-sensitive, practical decision support.

Key findings.
– The decision-tree-based prototype was evaluated by 20 developers, yielding a mean System Usability Scale (SUS) score of 87.5, which corresponds to a subjective rating of 'excellent' and indicates high perceived ease of use and low interactional effort.
– Quantitative feedback from the User Experience Questionnaire (UEQ-S) supported the tool's pragmatic quality, with developers rating it as supportive, simple, and structured.
– Users identified semantic friction points regarding domain-specific terminology (such as 'segmentation' and 'end users'), indicating that while the interaction logic was intuitive, terminology clarity requires refinement.
– The study's quantitative results are limited by a small evaluation sample size (n=20) and a simulated, task-based setting, meaning the survey metrics represent descriptive trends rather than statistically generalizable findings.
What it means for you
  • CIO / IT Executive: Issue a directive mandating that all AI development teams adopt a structured, tree-based selection tool like XAI-MAP to standardize and document how explainability methods are chosen for models entering the deployment pipeline.
  • IT Manager: Conduct a 30-minute team alignment meeting to introduce a decision-tree framework for XAI selection, and establish a shared project glossary that clearly defines confusing domain terms like 'segmentation' and 'end users' to eliminate semantic friction.
  • Business Strategist: Create a standardized 'AI Persona and Explanation Profile' for your product managers to complete, defining the exact needs of target 'end users' so developers can input these precise context characteristics into XAI-MAP.
  • Researcher: Draft a research proposal for a longitudinal field study to evaluate the XAI-MAP prototype with a larger sample size of active developers in real-world, non-simulated settings, specifically testing updated terminology to resolve semantic friction.
  • Policymaker: Update draft AI transparency regulations to recommend that compliance audits require organizations to show a documented, context-aware decision pathway (like a tree-based selection process) justifying their chosen model-explanation methods.
Explainable AI, Recommendation Tool, Artifact Instantiation, Decision Support System, Design Science Research
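
For readers unfamiliar with how a System Usability Scale figure such as the 87.5 reported above is obtained, the standard SUS scoring rule can be written out as below. The scoring rule itself is the conventional one (ten items rated 1 to 5, odd items scored as rating minus 1, even items as 5 minus rating, sum multiplied by 2.5); the respondent data are invented, since the paper's per-participant answers are not part of this summary.

```python
from statistics import mean


def sus_score(responses: list[int]) -> float:
    """Score one respondent's ten SUS items (each rated 1..5) on a 0..100 scale."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly ten ratings between 1 and 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1  # odd items are positively worded
        else:
            total += 5 - rating  # even items are negatively worded
    return total * 2.5


# Invented answers from two hypothetical respondents.
respondents = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 1, 5, 1, 5, 2, 4, 2],
]
scores = [sus_score(r) for r in respondents]
print(scores, mean(scores))
```

A study-level SUS value is the mean of the per-respondent scores, which is why a small sample (here n = 20 in the paper) leaves the figure descriptive rather than generalizable.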
Contextual Orchestration and Reward Shaping Engine (CORSE): Robust Decision-Making in Financial Advising
DESRIST 2026 · Part I

Ryan Yurosko, Alan Hevner, Wolfgang Jank and Daniel Zantedeschi
🎤 Lev Kenning & Lore Koestler
This study introduces and evaluates the Contextual Orchestration and Reward Shaping Engine (CORSE), a modular decision-support architecture designed to improve the reliability of large language model (LLM) outputs. The authors develop and formalize CORSE's orchestration and reward-shaping mechanisms using an Action Design Research methodology. The system is experimentally validated using LLaMA 3.3 and Claude 4.5 on 50 synthetic financial advisory profiles to evaluate its feasibility and performance.

The problem. Deploying LLMs in high-stakes, regulated environments like financial advising is challenging due to the models' sensitivity to prompt design and inability to consistently incorporate structured constraints. Typical prompting techniques or post-hoc evaluation approaches fail to prevent fluent yet materially incorrect or non-compliant outputs. There is a lack of closed-loop architectures that dynamically route prompts and enforce explicit domain-specific rules during inference.

Key findings.
– Under controlled simulation conditions, the full CORSE architecture (combining contextual orchestration and reward-shaping evaluation) achieved statistically significant improvements in pass rates and mean composite reward scores over both baseline and orchestration-only configurations across Claude 4.5 and LLaMA 3.3 models (all p-values reported as 0.0000, i.e., p < 0.0001).
– The performance difference between the Baseline and Contextual Orchestration-only (CO-only) configurations was not statistically significant (p = 0.2967 for Claude 4.5 pass rate, p = 0.2179 for LLaMA 3.3 pass rate), indicating that adaptive reward shaping is the primary driver of recommendation quality.
– CORSE enabled rapid error correction, with an average of 0.84 iterations required to resolve failure modes per client profile.
– The study's findings are limited by its reliance on a controlled synthetic financial-planning setting and prototype metrics, which may not fully capture the complexities of live, institutional advisory environments.
What it means for you
  • CIO / IT Executive: Review the current generative AI roadmap and mandate that all teams deploying LLMs in regulated or high-stakes environments halt plans that rely solely on advanced prompting or semantic routing (orchestration), shifting budget and engineering resources instead to build closed-loop 'reward shaping' validation pipelines that programmatically evaluate and iteratively correct LLM outputs prior to delivery.
  • IT Manager: Set up a Jira task to implement an automated verification and self-correction loop in your LLM application: configure a Python validator block that checks model outputs against structured business rules and, upon a failure, programmatically routes the error payload back to the LLM with a correction prompt, limiting the retry loop to 2 iterations.
  • Business Strategist: Identify one high-value, previously blocked automated advisory or customer-facing initiative and draft a pilot proposal to launch it, citing the CORSE finding that an active reward-shaping compliance layer significantly improved pass rates in simulated financial-planning workflows, and scope the pilot to test whether that result holds in live use.
  • Researcher: Draft a research proposal to extend the CORSE framework by partnering with an institutional financial firm to test the reward-shaping architecture against a live, anonymized dataset of real-world client profiles, moving beyond synthetic evaluations to measure its real-world error-correction latency and edge-case performance.
  • Policymaker: Draft an amendment to the internal AI governance guidelines stating that simple prompt engineering or static system-level instructions are insufficient for compliance in fiduciary or advisory LLM applications, and require developers to prove the implementation of active, closed-loop automated correction mechanisms.
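The verify-and-correct loop suggested for IT managers above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CORSE implementation: `generate` stands in for any LLM call, and the plan shape, the two rules, the `max_equity_pct` threshold, and the correction wording are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Plan = Dict[str, float]  # asset class -> allocation in percent (assumed output shape)


def validate(plan: Plan, max_equity_pct: float = 60.0) -> List[str]:
    """Return the list of rule violations (empty means the plan passes)."""
    errors: List[str] = []
    if abs(sum(plan.values()) - 100.0) > 1e-6:
        errors.append("allocations must sum to 100%")
    if plan.get("equity", 0.0) > max_equity_pct:
        errors.append(f"equity allocation exceeds the {max_equity_pct}% cap")
    return errors


def generate_with_correction(
    generate: Callable[[str], Plan], prompt: str, max_retries: int = 2
) -> Tuple[Plan, List[str], int]:
    """Call the model, validate, and re-prompt with the violations at most max_retries times."""
    plan = generate(prompt)
    errors = validate(plan)
    retries = 0
    while errors and retries < max_retries:
        retries += 1
        feedback = "; ".join(errors)
        plan = generate(
            f"{prompt}\nYour previous answer violated these rules: {feedback}. Please correct it."
        )
        errors = validate(plan)
    # A non-empty error list after the last retry should block delivery or escalate to a human.
    return plan, errors, retries
```

Capping the loop at two retries keeps latency bounded; whatever still fails after that is returned with its violations so the caller can decide how to handle it.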
Large Language Models, Design Science Research, Decision Support Systems, Prompt Engineering, Reward Shaping, Financial AI, CORSE Architecture
Designing Assistive Technologies for Blind and Visually Impaired: Problem Understanding and Design Objectives
DESRIST 2026 · Part I

Jan Laufer, Leonardo Banh, Thorsten Schoormann and Gero Strobel
🎤 Liv Knowles & Leo Kant
This study aims to ground the design of assistive technologies for blind and visually impaired people through empirically validated problem understanding and actionable guidance. Adopting an echeloned Design Science Research approach, the authors conducted 25 semi-structured interviews with visually impaired individuals to identify core challenges and define system requirements.

The problem. Many existing assistive technologies are built for mass markets and fail to align with the context-sensitive barriers and everyday practices of visually impaired individuals. Additionally, current research lacks a comprehensive understanding of how emerging artificial intelligence tools can be designed to address these complex daily challenges without causing user strain.

Key findings.
– Seven distinct problem areas (interaction with surroundings, assistive tools, daily life, well-being, public interaction, external dependencies, and financial aspects) were empirically identified and validated through qualitative coding.
– The research derived two major design objectives (Managing Interaction with the Surroundings and Managing Assistive Tools) and formulated thirteen concrete design requirements to guide future developers.
– As an initial, qualitative design science study, the findings rest on grounded theory methodology rather than statistical hypothesis testing. Recognized limitations include a German-specific context and a sample potentially biased toward higher technology affinity.
What it means for you
  • CIO / IT Executive: Audit your organization's internal software procurement policy to mandate that any incoming AI or enterprise software vendors demonstrate how their tools align with the study's 13 concrete design requirements, specifically focusing on minimizing user strain and integration with existing assistive tools for visually impaired employees.
  • IT Manager: Gather your software development and UX design teams for a 90-minute backlog refinement session to map your product's current user flows against the 7 identified problem areas, specifically prioritizing features that reduce external dependencies and improve interaction with surroundings.
  • Business Strategist: Review the product roadmap and draft a business proposal to pivot from a 'mass-market' one-size-fits-all design toward a modular, context-sensitive AI customization model that targets the daily life and public interaction challenges identified in the research.
  • Researcher: Draft a research proposal and participant recruitment screener for a quantitative follow-up study designed to validate the 13 design requirements in a non-German context, specifically targeting participants with low-to-moderate technology affinity to counter the original study's bias.
  • Policymaker: Initiate a review of public grant funding criteria for assistive technologies to require that applicant projects demonstrate compliance with the study's two major design objectives (Managing Interaction with the Surroundings and Managing Assistive Tools) to secure funding.
AI-driven Assistive Technology, Blind and Visually Impaired People, Design Knowledge, Echeloned Design Science Research, Design Science Research
When Shortest Isn't Safest: Senior-Friendly Pedestrian Routing
DESRIST 2026 · Part I

Erdi Ünal, Daniel Eisenhardt, Christian Meske, Seyed Nima Afzali and Aysegül Dogangün
🎤 Lev Kenning & Lore Koestler
This study utilizes Design Science Research to develop and evaluate a senior-friendly pedestrian routing artifact that integrates barrier avoidance and supportive amenities. The routing engine was implemented using open-source street and elevation data, and subsequently evaluated with older adults. The methodology includes a preliminary test and a field-based walking comparison against a mainstream navigation baseline.

The problem. Standard pedestrian navigation tools optimize primarily for distance or travel time, ignoring physical barriers, safety considerations, and supportive amenities critical to older adults. This misalignment can restrict late-life mobility, leading to decreased social participation and increased isolation. There is a lack of empirical design knowledge to guide practitioners in building senior-friendly, explainable routing applications.

Key findings.
– In a descriptive comparative preference test, 64% of participants (9 out of 14) preferred the senior-friendly route over the Google Maps baseline, while 29% reported no preference.
– Descriptive mean ratings favored the senior-friendly route across all dimensions, including perceived safety (+17.35%), comfort (+12.24%), physical load (+10.20%), and suitability for walking alone (+7.14%).
– Due to the small sample size (n=14) and descriptive nature of the final evaluation, formal statistical significance testing was not conducted on the quantitative route comparisons.
– Qualitative interviews indicated that route choices are highly situational, depending on dynamic contextual factors (weather, lighting, and season) and social conditions (walking alone versus accompanied).
– The study is limited by its small sample size, potential peer influence during group-based walking evaluations, and geographic confinement to a single mid-sized German city.
What it means for you
  • CIO / IT Executive: Direct your enterprise mapping and GIS architects to audit your current navigation data pipeline and draft an integration plan to ingest open-source digital elevation models (DEM) and pedestrian barrier data into your core routing infrastructure.
  • IT Manager: Task your development team to spin up a test instance of an open-source routing engine (like Valhalla or OSRM) and configure a custom routing profile that heavily penalizes steep inclines, stairs, and unlit pathways.
  • Business Strategist: Create a product pitch for a 'Safe & Accessible' routing toggle in your application, citing the study's descriptive results (about 17% higher perceived safety and 12% higher comfort ratings in a 14-person evaluation) to target the high-growth 'silver economy' demographic.
  • Researcher: Draft a research protocol for a multi-city field study with a target sample size of at least 100 older adults, incorporating dynamic contextual variables like weather and time-of-day to address the statistical and geographic limitations of the original study.
  • Policymaker: Direct the municipal GIS department to launch a public crowdsourcing campaign to map micro-barrier data (such as missing curb cuts, sidewalk quality, and resting benches) onto OpenStreetMap to enable local accessibility routing applications.
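The idea of a routing profile that penalizes steep inclines, stairs, and unlit paths can be illustrated with a small shortest-path sketch. This is not the study's routing engine, nor the configuration format of Valhalla or OSRM: the edge attributes, the 6% slope threshold, and the penalty multipliers are assumptions chosen only for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Edge: (neighbor, length_m, slope_pct, has_stairs, is_lit) -- attribute names are assumptions.
Edge = Tuple[str, float, float, bool, bool]


def edge_cost(length_m: float, slope_pct: float, has_stairs: bool, is_lit: bool) -> float:
    """Scale the physical length by hypothetical penalty factors."""
    cost = length_m
    if slope_pct > 6.0:   # steep incline
        cost *= 2.0
    if has_stairs:
        cost *= 5.0
    if not is_lit:
        cost *= 1.5
    return cost


def route(graph: Dict[str, List[Edge]], start: str, goal: str) -> Tuple[List[str], float]:
    """Dijkstra's algorithm over penalized edge costs."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    best: Dict[str, float] = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, cost
        if node in best and best[node] <= cost:
            continue
        best[node] = cost
        for nxt, length, slope, stairs, lit in graph.get(node, []):
            step = edge_cost(length, slope, stairs, lit)
            heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return [], float("inf")


graph = {
    "A": [("B", 100.0, 0.0, True, True),    # short, but has stairs
          ("C", 120.0, 1.0, False, True)],  # longer, flat and lit
    "C": [("B", 120.0, 1.0, False, True)],
}
print(route(graph, "A", "B"))  # → (['A', 'C', 'B'], 240.0)
```

With these weights the 100 m staircase costs 500, so the 240 m detour wins; tuning the multipliers is where input from older adults would matter most.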
Accessibility, Senior Mobility, Pedestrian Routing, Navigation Systems, Design Science Research
From Human-in-the-Loop to Human-on-the-Loop: VALID Framework for Resilient Trust in Autonomous AI Agent Contract Execution
DESRIST 2026 · Part I

Jonathan Ocasio and Matthew Mullarkey
🎤 Lev Kenning & Lore Koestler
This study introduces the Verified Adaptive Learning in Deployment (VALID) framework, a five-layer governance architecture designed to support the transition from human-in-the-loop oversight to human-on-the-loop supervision in autonomous contract execution contexts. Using Elaborated Action Design Research (eADR), the framework was developed and iteratively refined through three design cycles. Its performance was evaluated using a multi-AI agent simulation across 150 matched runs against a simulated human baseline incorporating modeled cognitive biases.

The problem. While organizations increasingly delegate complex tasks to autonomous AI agents, existing governance frameworks provide limited guidance on how these agents can earn and sustain interorganizational trust. There is a lack of architectures that support continuous, performance-based agent learning while ensuring strict compliance with constitutional constraints. Consequently, organizations struggle to transition safely from direct human oversight to high-level supervisory roles.

Key findings.
– The VALID framework successfully supported trust progression, with agents demonstrating a statistically significant increase in trust from a neutral baseline of 0.50 to a mean of 0.805 across 150 simulated runs.
– The hypothesis that governed agents would maintain high decision quality and avoid hesitation was supported, with agents achieving a stable 48.9% win rate and 0% avoidance behavior, compared to a 47.1% avoidance rate in the simulated human baseline.
– The initial assumption that higher trust scores would naturally produce greater operational sustainability was not supported; high-trust agents (trust >= 0.85) achieved only 53.8% sustainability compared to 70.0% for lower-trust agents because they accepted commitments beyond organizational delivery capacity.
– A primary limitation is that all findings are based on modeled computational outcomes in a controlled simulation under ideal technical conditions rather than empirically observed real-world behavior.
What it means for you
  • CIO / IT Executive: Review your active AI deployment roadmap and issue a directive requiring all development teams to implement a five-layer governance architecture with constitutional constraints before transitioning any agent from direct human-in-the-loop approval to human-on-the-loop supervisory control.
  • IT Manager: Open your active AI contract-execution agent configuration dashboards and hardcode strict operational capacity limits into the decision-making rules to ensure high-trust agents cannot automatically accept commitments that exceed your team's physical delivery bandwidth.
  • Business Strategist: Audit the KPIs used to evaluate autonomous negotiation agents and mandate that trust progression metrics (like contract win rates) are strictly balanced against real-time operational capacity parameters to prevent unsustainable over-commitment.
  • Researcher: Draft a research proposal for a controlled, real-world pilot study to test the VALID framework's simulation findings, specifically evaluating if real human-on-the-loop supervisors experience the same over-commitment vulnerabilities observed in the computational model.
  • Policymaker: Draft an update to your agency's AI governance standards that mandates 'human-on-the-loop' override capabilities and strict capacity-safeguard thresholds for any autonomous agents executing legally binding public contracts.
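The capacity safeguard recommended above can be sketched as a hard check that runs regardless of an agent's trust score. The class, field names, and units below are hypothetical and are not taken from the VALID framework; the point is only that trust should not be able to override delivery capacity.

```python
from dataclasses import dataclass


@dataclass
class ContractAgent:
    trust: float                  # 0.0 to 1.0, earned through past performance
    capacity_units: float         # total delivery capacity (assumed unit of work)
    committed_units: float = 0.0  # capacity already promised

    def can_accept(self, contract_units: float) -> bool:
        """Capacity check that deliberately ignores the trust score."""
        return self.committed_units + contract_units <= self.capacity_units

    def accept(self, contract_units: float) -> bool:
        """Commit only if capacity allows; otherwise leave the decision to a human supervisor."""
        if not self.can_accept(contract_units):
            return False
        self.committed_units += contract_units
        return True


agent = ContractAgent(trust=0.90, capacity_units=10.0)
print(agent.accept(6.0))  # → True
print(agent.accept(5.0))  # → False (would exceed capacity despite high trust)
```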
Design Science Research, Resilient Trust, Autonomous AI Agent, Multi-Agent Collaboration, Elaborated Action Design Research, Ethical AI Governance
Designing a Deployable ML Development Framework that Operationalizes Trustworthy Predictive Applications in the Medical Field
DESRIST 2026 · Part I

Arindam Brahma, Kala Seal, Yasaman Ghasemi and Samir Chatterjee
🎤 Liv Knowles & Leo Kant
This study presents and validates a socio-technical machine learning development framework called UML-Med designed to build trustworthy and deployable clinical prediction applications. The researchers used Design Science Research Methodology (DSRM) to structure the framework and validated its practical application by building a patient no-show prediction model using five years of electronic health record data (7.9 million appointments) from a pediatric hospital.

The problem. Despite promising predictive performance, machine learning models face limited adoption in real-world clinical settings due to a trust gap caused by prediction uncertainty, lack of explainability, and limited methodological reproducibility. Traditional model evaluations rely on static performance metrics like accuracy or area under the curve (AUC) without assessing model stability and predictive variance under realistic data variations.

Key findings.
– Incorporating uncertainty quantification alongside traditional metrics enabled the selection of a more stable and reliable model (AdaBoost) that had a lower prediction uncertainty (mean spread of 0.0073) and stable cross-validation performance. (Supported)
– Evaluating models purely on predictive accuracy (AUC) was shown to be insufficient for clinical deployment, as the highest-performing models (Bayesian Neural Networks and Gradient Boosting) exhibited greater predictive variance and uncertainty. (Supported)
– Model-agnostic explanations (SHAP) demonstrated that patient behavioral history and scheduling details (such as past cancellations, rescheduling frequency, and late-day appointment hours) are more powerful predictors of no-shows than demographic variables. (Supported)
– Limitation: The framework's validation was restricted to historical electronic health records from a single pediatric ambulatory clinic, so model performance and feature weights may drift or differ across geographic settings and clinical populations.
What it means for you
  • CIO / IT Executive: Draft an executive mandate requiring all upcoming clinical ML project proposals to submit uncertainty quantification (UQ) metrics and prediction variance analyses alongside standard accuracy/AUC scores before receiving deployment approval.
  • IT Manager: Update the data pipeline requirements for the patient scheduling system to integrate SHAP-based feature importance tracking, prioritizing patient behavioral history metrics (such as past cancellations and reschedule rates) over demographic data.
  • Business Strategist: Initiate a pilot program to redesign the afternoon appointment scheduling workflow, targeting patients with a history of cancellations with automated, high-touch SMS reminders sent specifically for late-day time slots.
  • Researcher: Draft a multi-site validation protocol to test the UML-Med framework using EHR data from an external clinic or a different geographic region to evaluate model drift and feature weight stability.
  • Policymaker: Begin drafting a policy memo proposing new clinical AI certification standards that require developers to submit documented uncertainty quantification (UQ) and model-agnostic explainability (SHAP) reports prior to clinical software procurement.
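Selecting a model on uncertainty as well as AUC, as the findings above describe, can be sketched as follows. The summary does not define "mean spread", so this sketch assumes it is the average range of a patient's predicted probability across resampled model fits. The AUC tolerance, model names, and numbers are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def mean_spread(preds_per_patient: List[List[float]]) -> float:
    """Average (max - min) of each patient's predictions across resampled fits (assumed definition)."""
    return mean(max(p) - min(p) for p in preds_per_patient)


def select_model(candidates: Dict[str, dict], auc_tolerance: float = 0.02) -> str:
    """Among models within auc_tolerance of the best AUC, pick the one with the lowest spread."""
    best_auc = max(c["auc"] for c in candidates.values())
    eligible = {
        name: c for name, c in candidates.items() if best_auc - c["auc"] <= auc_tolerance
    }
    return min(eligible, key=lambda name: mean_spread(eligible[name]["preds"]))


# Hypothetical numbers: model_a scores slightly higher on AUC but is less stable.
candidates = {
    "model_a": {"auc": 0.81, "preds": [[0.20, 0.35], [0.60, 0.80]]},
    "model_b": {"auc": 0.80, "preds": [[0.30, 0.31], [0.70, 0.71]]},
}
print(select_model(candidates))  # → model_b
```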
Patient No-show Prediction, Uncertainty Quantification, Explainable AI, SHAP, Pediatric Ambulatory Care, Design Science Research
A Design Taxonomy for GUI-Based Test Automation
DESRIST 2026 · Part I

Enes Kara, Bijan Khosrawi-Rad and Frederik Möller
🎤 Liv Knowles & Leo Kant
This study utilizes design science research to conceptualize and map the design space of graphical user interface (GUI) test automation. The authors analyze 103 GUI-based test automation frameworks to build a structured, multi-dimensional taxonomy. This taxonomy is integrated with a cluster analysis to identify and describe recurring design patterns, or archetypes, in test automation.

The problem. The current landscape of GUI-based test automation is fragmented and diverse, ranging from deterministic manual approaches to autonomous AI solutions. Practitioners struggle to make informed decisions because the underlying design choices of various tools and frameworks remain implicit and lack a structured comparison framework. This leads to tool-centric selection rather than strategic, design-aligned configuration.

Key findings.
– Developed a taxonomy containing four meta-dimensions (context, test strategy, system view, execution) and nine design dimensions to classify GUI test automation.
– Identified five distinct design archetypes (human-guided, deterministic, autonomous learning-based, rule-based, and learning-guided) through a k-modes cluster analysis of 103 frameworks.
– Evaluated the taxonomy's practical applicability by mapping it to five established real-world tools, demonstrating how different design choices align with specific testing goals.
– This design science research study did not conduct empirical statistical hypothesis testing; the identified archetypes are descriptive and require further empirical validation with active practitioners in real-world settings.
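For readers unfamiliar with k-modes, the clustering step behind the archetypes can be sketched in a few lines. K-modes is the categorical counterpart of k-means: it uses Hamming distance and per-dimension modes instead of Euclidean distance and means. The records, dimension values, and starting modes below are made up for illustration and are not the study's data or code.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Record = Tuple[str, ...]


def hamming(a: Record, b: Record) -> int:
    """Number of dimensions on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_of(records: Sequence[Record]) -> Record:
    """Most frequent value per dimension."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records: Sequence[Record], initial_modes: Sequence[Record], max_iter: int = 20):
    modes = list(initial_modes)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign each record to its nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Recompute each cluster's mode from its members.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = mode_of(members)
    return labels, modes


# Hypothetical framework descriptions along three made-up dimensions.
records = [
    ("manual", "scripted", "none"),
    ("manual", "scripted", "rules"),
    ("manual", "recorded", "none"),
    ("auto", "generated", "ml"),
    ("auto", "generated", "rules"),
    ("auto", "explored", "ml"),
]
labels, modes = k_modes(records, initial_modes=[records[0], records[3]])
print(labels)  # → [0, 0, 0, 1, 1, 1]
```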
What it means for you
  • CIO / IT Executive: Instruct your Enterprise Architecture team to draft a new IT procurement policy requiring all future GUI test automation software purchases to be evaluated against the study's four meta-dimensions (context, test strategy, system view, execution) rather than relying on vendor feature lists, preventing redundant tool lock-in.
  • IT Manager: Gather your QA leads for a 90-minute workshop to map your department's current testing tool stack against the five design archetypes (e.g., deterministic vs. learning-guided) to identify overlapping tool capabilities, eliminate redundant license costs, and find gaps in autonomous testing.
  • Business Strategist: Review your product portfolio's time-to-market bottlenecks and align product risk tolerances with the study's archetypes, allocating budget to move fast-paced, low-risk customer-facing interfaces to 'autonomous learning-based' testing to accelerate release cycles.
  • Researcher: Draft a research grant proposal and contact two industry partners to establish a collaborative field study that empirically measures and compares the actual maintenance overhead and bug-detection rates of 'deterministic' vs. 'learning-guided' GUI testing archetypes in live production environments.
  • Policymaker: Revise your organization's internal Software Quality Assurance (SQA) compliance standards to mandate that all internal development teams classify and document their test automation suites using the study's 9-dimension design taxonomy to ensure long-term systems maintainability.
GUI-based test automation, testing strategy, design taxonomy, design archetype, test automation framework, design space
Supporting Information Seeking in Software Development: A Design Theory for Local Retrieval-Augmented Generation Across Distributed Knowledge Sources
DESRIST 2026 · Part I

Flora Horn and Martin Wiener
🎤 Lev Kenning & Lore Koestler
This study utilizes a Design Science Research approach to develop and evaluate a design theory for cross-source information seeking in software development. The authors design and instantiate a local-first, modular Retrieval-Augmented Generation (RAG) prototype named SPOCK that operates on commodity hardware. The system's feasibility, usability, and performance are evaluated through formative benchmarks and a naturalistic field study in a small-to-medium enterprise (SME).

The problem. Software developers spend nearly a third of their workday searching and integrating fragmented information across heterogeneous sources, leading to high cognitive overhead and reduced efficiency. While cloud-hosted Retrieval-Augmented Generation (RAG) offers a solution, privacy concerns and high infrastructure costs of Large Language Models (LLMs) prevent adoption in resource-constrained small and medium-sized enterprises (SMEs). Conversely, using compact, locally-deployed Small Language Models (SLMs) introduces high risks of hallucination and model confusion when processing noisy, multi-source contexts.

Key findings.
– Formative benchmarks empirically demonstrated that a hybrid retrieval strategy combining keyword scoring with vector-based semantic similarity achieves a more balanced search performance (overall Mean Reciprocal Rank of 0.67 and Precision@10 of 0.79) than keyword-only or embedding-only approaches in isolation.
– Field evaluation confirmed that a centralized knowledge index (DP-X1) reduces developers' search effort and context-switching overhead compared to traditional fragmented search workflows.
– The deployment of evidence-driven traceability features (DP-RS1), such as collapsible 'evidence chips' linked to original documents, was shown to increase developer trust and facilitate rapid technical verification of generated outputs.
– System administrators confirmed the feasibility of local-first, containerized Small Language Model (SLM) deployment (DP-GE1) for meeting data privacy compliance on standard hardware (8-16 GB RAM), though latency was identified as a potential adoption risk.
– Acknowledged limitations of the study include its evaluation within a single SME, which may limit immediate transferability to other organizational environments, and the limited generative depth inherent to compact SLMs.
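The hybrid retrieval idea and the two reported metrics can be made concrete with a short sketch. It is not SPOCK's implementation: the keyword score here is a simple term-overlap ratio rather than whatever scoring the prototype uses, and the equal weighting `alpha=0.5` is an assumption.

```python
import math
from typing import List, Sequence, Set, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that occur in the text (simplified stand-in for keyword scoring)."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_score(kw: float, sem: float, alpha: float = 0.5) -> float:
    """Weighted blend of keyword and semantic similarity; alpha is an assumed weight."""
    return alpha * kw + (1 - alpha) * sem


def reciprocal_rank(ranked_ids: List[str], relevant: Set[str]) -> float:
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[List[str], Set[str]]]) -> float:
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def precision_at_k(ranked_ids: List[str], relevant: Set[str], k: int = 10) -> float:
    return sum(doc_id in relevant for doc_id in ranked_ids[:k]) / k


runs = [(["a", "b"], {"a"}), (["a", "b"], {"b"})]
print(mean_reciprocal_rank(runs))  # → 0.75
```

Read this way, an MRR of 0.67 means the first relevant document sits around rank 1 to 2 on average, and a Precision@10 of 0.79 means roughly eight of the top ten results are relevant.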
What it means for you
  • CIO / IT Executive: Instruct your infrastructure team to deploy a containerized local Small Language Model (SLM) on standard 8-16 GB RAM commodity hardware to run a proof-of-concept for offline, local-first RAG, eliminating cloud subscription costs and data privacy compliance risks.
  • IT Manager: Update the development backlog to implement a hybrid search strategy in your internal developer portal, combining traditional keyword scoring with vector-based semantic similarity, and mandate that the UI displays collapsible 'evidence chips' linked to source files to allow rapid verification of search results.
  • Business Strategist: Draft a business case and project charter to consolidate the company's fragmented engineering documentation into a centralized knowledge index (DP-X1), targeting a reduction in the nearly one-third of the workday that developers currently spend searching for and integrating information.
  • Researcher: Draft a research proposal to address the study's generalizability and technical limitations by designing a multi-firm comparative study that tests local-first RAG frameworks, focusing specifically on optimizing latency and expanding the generative depth of compact SLMs.
  • Policymaker: Incorporate 'local-first, containerized AI deployment on standard hardware' into your department's tech-adoption grant guidelines and compliance advice as a recommended, low-cost path for SMEs to adopt AI without risking IP leaks or violating strict data privacy regulations.
GenAI, RAG, SLM, Design Science Research, Information Foraging Theory, Software Development
Making the Implicit Explicit: A Human-In-The-Loop AI Pipeline for Excavating and Making Use of Latent Design Knowledge
DESRIST 2026 · Part I

Timo Strohmann, Linda Sagnier Eckert, Daniel Heinz, Thorsten Schoormann, Christoph Hoppe-Ludwig and Anne Ixmeier
🎤 Liv Knowles & Leo Kant
This study introduces AIDE, a human-in-the-loop, AI-assisted pipeline designed to systematically extract and formalize latent design knowledge from non-design science publications. The authors demonstrate its feasibility by applying it to a corpus of 99 interdisciplinary papers focusing on AI-enabled circular economy strategies.

The problem. Valuable design insights in applied fields like engineering and computer science often remain embedded implicitly in technical descriptions and evaluations, rather than being expressed as reusable design knowledge. Extracting these insights manually is highly resource-intensive and lacks a systematic, scalable methodology.

Key findings.
– The study successfully demonstrated the feasibility of the AIDE pipeline by extracting and formalizing one expert-validated design principle per paper from a corpus of 99 peer-reviewed publications.
– The pipeline processed all 99 publications in approximately 1.5 hours (just under a minute per paper) using Gemini 2.5 Flash, establishing a balance between extraction quality and computational efficiency.
– The effectiveness of the pipeline is limited by the richness and clarity of the source text, as well as current large language model constraints regarding context windows and inference stability.
What it means for you
  • CIO / IT Executive: Approve a pilot project to deploy a lightweight LLM pipeline (like Gemini Flash) to scan your organization's unstructured legacy technical wikis and project post-mortems, automatically converting them into a structured, searchable catalog of engineering design patterns.
  • IT Manager: Write a Python script using the Gemini API to parse the technical documentation of your team's last 50 software releases, and schedule a 1-hour Monday afternoon meeting with senior developers to review and validate the extracted system-design rules.
  • Business Strategist: Download 100 public patent filings or technical white papers from your top three competitors, and use an LLM prompt pipeline to extract and map their hidden circular-economy and manufacturing design strategies to find market gaps.
  • Researcher: Batch-download 50 academic papers relevant to your current literature review, input them into an LLM with a structured prompt to extract 'context, mechanism, and design outcomes' from each, and synthesize the results into a taxonomy matrix.
  • Policymaker: Draft an amendment to your department's public research grant guidelines that mandates all future technical and scientific awardees submit a structured 'explicit design principles' metadata file alongside their final reports to enable automated knowledge indexing.
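A human-in-the-loop extraction pass of the kind described above might be structured as follows. This is a sketch under stated assumptions, not the AIDE pipeline: `llm` is any callable that takes a prompt and returns text (no specific vendor API is implied), and the prompt wording, JSON keys, and status labels are hypothetical.

```python
import json
from typing import Callable, Dict, List

PROMPT = (
    "Extract one design principle from the paper below. "
    'Reply with JSON containing the keys "context", "mechanism", "outcome".\n\n{text}'
)
REQUIRED_KEYS = ("context", "mechanism", "outcome")


def extract(papers: Dict[str, str], llm: Callable[[str], str]) -> List[dict]:
    """Run one extraction per paper and queue every result for expert review."""
    results = []
    for paper_id, text in papers.items():
        raw = llm(PROMPT.format(text=text))
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            data = None
        ok = isinstance(data, dict) and all(data.get(k) for k in REQUIRED_KEYS)
        results.append({
            "paper": paper_id,
            "principle": data if ok else None,
            # Nothing is accepted automatically: well-formed output still needs a human check.
            "status": "pending_review" if ok else "needs_manual_extraction",
        })
    return results
```

Keeping a `status` field per paper makes the expert-validation step explicit, and malformed model output falls back to manual extraction instead of being silently dropped.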
Design Grounding, Design Knowledge, Latent Design Knowledge, Human-AI Collaboration, Computational Design Science Research
Design Beyond the Problem Space: Anticipatory Design Science
DESRIST 2026 · Part I

Dirk Hovorka
🎤 Liv Knowles & Leo Kant
This study proposes a theoretical framework for anticipatory design science to complement traditional Design Science Research (DSR). It introduces strategies for temporal expansion to help researchers design technology while proactively imagining its long-term sociotechnical consequences.

The problem. Traditional design science is often limited by its focus on immediate, present-day problems, which temporally localizes solutions and ignores future social, ethical, and environmental impacts. Without tools to speculate about future worlds, designers frequently project current assumptions onto future users, potentially causing unintended negative consequences when technologies become mundane.

Key findings.
– Proposed three conceptual strategies for temporal expansion in design: problem problematization, designing speculative social worlds, and planning for evolving systems rather than final goals (Theoretical proposition; no empirical or statistical hypotheses were tested).
– Argued that designing with speculative 'possible worlds' helps expose hidden assumptions and potential negative externalities of technological systems before they are deployed (Conceptual analysis; no empirical or statistical hypotheses were tested).
– Recommended a shift in design evaluation criteria toward metrics like criticality, interestingness, and fruitfulness to better judge future-oriented designs (Conceptual recommendation; no empirical or statistical hypotheses were tested).
– Limitation: this is a conceptual and theoretical paper; the proposed design strategies have not been empirically validated or statistically tested in real-world organizations.
What it means for you
  • CIO / IT Executive: Update your enterprise IT project charter template to include a mandatory 'Speculative Pre-Mortem' section, requiring project sponsors to outline one major unintended sociotechnical risk if the proposed system achieves 100% user adoption, and define a monitoring metric for it before funding is approved.
  • IT Manager: In Monday's sprint planning meeting, add a 'System Evolution' checklist item to your team's 'Definition of Done' (DoD), requiring developers to document how a newly designed software architecture can adapt to two hypothetical future data privacy shifts without a complete rewrite.
  • Business Strategist: Schedule a 90-minute 'Speculative Worldbuilding' workshop with your product design team for this week, using the prompt: 'Imagine a future where our service is free and legally mandated for all citizens; map out three negative externalities this would create for our users and draft preventative business model pivots for each.'
  • Researcher: Revise the evaluation section of your current research manuscript or grant proposal to replace traditional, static utility metrics with a qualitative evaluation rubric that explicitly measures 'criticality' (how the prototype challenges current industry paradigms) and 'fruitfulness' (the quantity of new research questions the design generates).
  • Policymaker: Draft a memo to establish an agile 'Anticipatory Technology Panel' within your department, tasking them with conducting quarterly simulations of emerging tech trends (e.g., generative UI) to draft proactive regulatory guardrails before those technologies reach mass-market scale.
Design Science, futures, metaphor, speculation, anticipatory design
Design Knowledge for Ethical Use of AI in Emergency Medical Dispatch Systems
DESRIST 2026 · Part I

Lisa Dasmann, Alexander van der Staay and Christian Janiesch
🎤 Lev Kenning & Lore Koestler
This study employs a design science research approach to establish prescriptive design knowledge for the ethical integration of artificial intelligence within emergency medical dispatch systems. The methodology combines a systematic literature review of 34 papers with qualitative analysis of eight semi-structured interviews with experts and laypeople. The resulting design requirements and principles were evaluated and refined through an online focus group workshop with four domain experts.

The problem. The implementation of artificial intelligence in safety-critical, high-stress emergency dispatch environments raises complex ethical challenges, such as algorithmic bias, opaque decision-making, and the potential erosion of human accountability. While technological capability is rapidly advancing, there is a lack of concrete, practice-oriented design guidelines that translate abstract ethical concepts into actionable requirements for developers and practitioners. This gap risks a premature and unreflective shift of decision authority from human dispatchers to automated systems.

Key findings.
– The study successfully formulated nine distinct design requirements and six actionable design principles to guide the ethical development and deployment of AI in emergency dispatch systems.
– The proposed design principles were conceptually validated and supported by a consensus of domain experts during an ex-ante evaluation workshop, though traditional statistical hypothesis testing was not conducted due to the qualitative design science methodology.
– Key principles advocate for positioning AI strictly as non-binding decision support, using adaptive and purpose-bound data collection, incorporating medically validated gender-specific symptom profiles, and providing low-threshold, cue-based explanations.
– The study's findings are limited by its ex-ante nature, meaning the principles were evaluated conceptually rather than through the ex-post observation of a fully implemented system in a live, real-world operational environment.
What it means for you
  • CIO / IT Executive: Draft a memorandum to the procurement and software development teams mandating that all future AI integration contracts for emergency dispatch systems must strictly define the AI's role as non-binding decision support, stripping any automated-dispatch capabilities from the system architecture.
  • IT Manager: Audit the current dispatch system's user interface design and draft a development ticket to implement 'cue-based explanations', such as color-coding and highlighting specific keywords in the caller transcript, so dispatchers immediately see the exact medical reasoning behind an AI-generated risk assessment.
  • Business Strategist: Create an updated training protocol and change-management handbook for dispatchers that clearly defines human-in-the-loop accountability, outlining specific procedures for when and how dispatchers should confidently override AI suggestions.
  • Researcher: Draft a research proposal and ethics board submission for an ex-post observational study in a partner dispatch center to measure dispatcher override rates, decision latency, and accuracy when presented with the study's proposed cue-based explanations in a live, high-stress environment.
  • Policymaker: Draft an amendment to the regional Emergency Medical Services (EMS) regulatory framework that legally requires dispatch agencies utilizing triage AI to prove their software incorporates medically validated, gender-specific symptom profiles to mitigate algorithmic bias in cardiac and stroke diagnostics.
Emergency Medicine, Prehospital Care, Artificial Intelligence, AI Ethics, Design Principles, Design Science Research
What is a Problem? Criteria and Guidelines for Crafting Research Problems in Design Science
DESRIST 2026 · Part I

Vincent Beermann, Thomas Haskamp and Jan Vom Brocke
🎤 Liv Knowles & Leo Kant
This study develops a conceptual framework designed to guide researchers in identifying, positioning, evaluating, and formulating research problems in Design Science Research (DSR). Drawing on both design science and design thinking methodologies, the authors propose a workflow comprising the Problem Funnel, the Problem Space Matrix, seven explicit quality criteria, and five actionable design guidelines. The utility of the framework is demonstrated through two illustrative application examples: sustainable behavior in residential buildings and enterprise systems migration.

The problem. While identifying a research problem is recognized as the essential first step in Design Science Research, existing methodologies offer very limited operational guidance on what constitutes a high-quality problem. Consequently, researchers frequently struggle with either overly abstract 'grand challenges' that are intractable in a single project, or overly specific 'technical problems' that fail to generate generalizable, theory-driven knowledge. This lack of systematic evaluation criteria leads to poorly formulated problem statements and potentially irrelevant design artifacts.

Key findings.
– Developed the 'Problem Funnel' model to conceptually structure how broad, abstract challenges are progressively decomposed into research-tractable problems.
– Developed the 'Problem Space Matrix' to position problems along two dimensions (scope and structure), identifying the sweet spot for Design Science Research as focused yet complex problems.
– Proposed seven explicit, literature-grounded criteria for evaluating research problem quality: appropriate specificity, empirical grounding, stakeholder significance, tractable solvability, generalizable scope, solution novelty, and articulable clarity.
– Formulated five practical guidelines for problem identification and development to help researchers systematically navigate and refine their problem statements.
– Identified several common pitfalls in problem formulation, such as the Scope Trap, Solution-First Thinking, Gap-Spotting Without Problematization, and the Novelty Trap.
– Acknowledged that a key limitation of the proposed framework is the lack of solid empirical validation beyond the five seminar projects and two domain illustrations used to demonstrate its applicability.
What it means for you
  • CIO / IT Executive: On Monday morning, halt the procurement process of your top three active IT initiatives and mandate that each project lead present a one-page assessment proving they haven't fallen into 'Solution-First Thinking', requiring them to explicitly state the empirical business problem before any software licenses are purchased.
  • IT Manager: On Monday morning, gather your systems migration or development team for a 1-hour workshop to map your current, stalled technical bottlenecks onto the 'Problem Funnel' model, systematically decomposing your broad system challenges into highly specific, tractable, and assignable technical tasks.
  • Business Strategist: On Monday morning, audit your current pipeline of innovation initiatives against the 'Novelty Trap' and 'Gap-Spotting' pitfalls by scoring each initiative against the paper’s 'stakeholder significance' and 'solution novelty' criteria to weed out projects that chase market trends without solving real customer pain points.
  • Researcher: On Monday morning, plot your current research draft on the 'Problem Space Matrix' to verify if your topic sits in the 'focused yet complex' sweet spot, and rewrite your problem statement to explicitly address two of the seven quality criteria: 'appropriate specificity' and 'generalizable scope'.
  • Policymaker: On Monday morning, update the assessment rubric for your department's upcoming public funding calls (such as green building initiatives) to incorporate the 'tractable solvability' and 'empirical grounding' criteria, ensuring public funds are not wasted on overly abstract 'grand challenges' that lack practical execution paths.
Design Science Research, Problem Formulation, Wicked Problems, Research Methodology, Problem Space, Design Thinking
The Medium is the Prompt: Prompts as Design Science Artifacts
DESRIST 2026 · Part I

Konrad Schulte, Savindu Herath, Frederik Möller and Christine Legner
🎤 Liv Knowles & Leo Kant
This study conceptualizes prompts as Design Science Research (DSR) artifacts to integrate them into systematic socio-technical development and evaluation methodologies. The authors outline a two-level conceptualization, task support and prompt specification, to guide prompt engineering and research in non-deterministic large language model environments.

The problem. In existing Design Science Research, prompts are frequently treated as minor, static implementation details rather than intentionally designed components, leading to a lack of transparency and poor evaluation rigor. This under-specification exacerbates the gaps between stakeholder intentions and final model outputs, especially given the inherently non-deterministic nature of generative AI.

Key findings.
– The paper logically establishes that prompts meet all core criteria of design science artifacts, meaning they are human-made, purposeful, utility-based, evaluable, socio-technically embedded, and technology-oriented.
– The authors proposed a two-level conceptual framework comprising task support (defining expectations, boundaries, and context) and prompt specification (encoding structured instruction techniques like chain-of-thought or few-shot prompting).
– The study delineates structured guidelines for prompt evaluation, emphasizing that because LLM behavior is non-deterministic, evaluation must rely on repeated executions, variance reporting, and failure-mode analysis rather than one-off tests.
– As a purely conceptual and argumentative paper, no empirical hypotheses were formulated or statistically tested, which represents a limitation that future empirical studies must address.
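The multi-run evaluation idea in the findings above (repeated executions, variance reporting, failure-mode analysis) can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: `fake_model`, `classify_outcome`, and `evaluate_prompt` are hypothetical names, the stub replaces a real LLM call with a seeded random choice, and the default of 50 runs is an arbitrary illustrative value.

```python
import random
import statistics
from collections import Counter


def fake_model(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a non-deterministic LLM call."""
    return rng.choice(["42", "42", "42", "forty-two", ""])


def classify_outcome(output: str) -> str:
    """Label one output as 'ok' or as a named failure mode (assumed rubric)."""
    if output == "":
        return "empty_output"
    if not output.isdigit():
        return "wrong_format"
    return "ok"


def evaluate_prompt(prompt: str, runs: int = 50, seed: int = 0) -> dict:
    """Run the same prompt many times and summarize pass rate, variance, and failures."""
    rng = random.Random(seed)
    outcomes = [classify_outcome(fake_model(prompt, rng)) for _ in range(runs)]
    passes = [1 if outcome == "ok" else 0 for outcome in outcomes]
    return {
        "runs": runs,
        "pass_rate": statistics.mean(passes),
        "pass_variance": statistics.pvariance(passes),
        "failure_modes": Counter(o for o in outcomes if o != "ok"),
    }


report = evaluate_prompt("Answer with digits only: 6 * 7 = ?")
print(report)
```

In practice the stub would be replaced by the actual model call and the rubric by task-specific checks; the point is only that the report describes a distribution of outcomes rather than a single run.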
What it means for you
  • CIO / IT Executive: On Monday morning, establish a centralized, version-controlled repository (such as a Git-based Prompt Registry) and mandate that all generative AI prompts used in production are treated as formal software assets, requiring documented ownership, change logs, and alignment with corporate IT governance.
  • IT Manager: On Monday morning, update your team's QA workflow to prohibit 'one-and-done' manual prompt testing; mandate a testing protocol where every modified prompt must be run automatically at least 50 times to record output variance and map failure modes before deployment.
  • Business Strategist: On Monday morning, audit your current AI initiatives and write a structured 'Task Support' brief for the development team that explicitly defines the business context, operational boundaries, and acceptable guardrails for the model's outputs to bridge the gap between business intent and model execution.
  • Researcher: On Monday morning, design an empirical experiment to address the paper's core limitation by setting up an A/B test that quantitatively measures whether prompts structured using the two-level framework (Task Support + Prompt Specification) yield lower output variance and fewer failures than standard baseline prompts.
  • Policymaker: On Monday morning, draft a clause for your organization's AI Procurement Policy requiring all third-party generative AI vendors to provide documented prompt specifications and multi-run evaluation reports (including variance and failure-mode analysis) to ensure socio-technical transparency.
Prompts, Artifacts, Design Science, Large Language Models, Generative Artificial Intelligence
Healthcare Ecosystem Orchestration: An Action Design Research Study for Cross-Level Patient Navigation
DESRIST 2026 · Part I

Florian Bontrup, Christoph Kollwitz, Laura Biller, Hendrik Wache and Sarah Hönigsberg
🎤 Liv Knowles & Leo Kant
This study employs an eight-year Action Design Research (ADR) methodology to design, build, and evaluate a Healthcare Ecosystem Orchestration (HEO) artifact. The HEO artifact coordinates cross-level patient navigation across supra-regional, regional, and local healthcare environments. Its utility is evaluated using a naturalistic, ex post assessment of 48,516 real-world navigation cases.

The problem. Industrialized healthcare systems are highly fragmented, leading to significant inefficiencies and resource misallocation as patients navigate autonomous organizations. Traditional digital triage systems and symptom checkers are evaluated primarily on diagnostic accuracy, failing to translate clinical assessments into actionable navigation paths under level-specific capacity constraints. Consequently, there is a design gap in how to coordinate patient flows across diverse ecosystem levels without resorting to hierarchical control.

Key findings.
– The central hypothesis that the distribution of recommended urgency levels significantly differs across ecosystem levels was statistically supported (p < 0.001).
– Emergency-level triage recommendations were significantly higher at the regional hospital level (33.6%) compared to the supra-regional insurance level (9.3%) and the local general practitioner level (6.5%).
– Self-care recommendations were highest at the local general practitioner level (41.6%), representing a 5.4-fold difference compared to the regional hospital level (7.8%).
– The study developed and validated four core design principles: Compliant Self-Service Navigation, Tripartite Separation of Concerns, Level-Adaptive Configuration, and Cross-Level Resource Orchestration.
– Limitations include a study context restricted to the German healthcare system, with its rigid separation of ambulatory and inpatient care, and the lack of assessment of long-term downstream metrics such as provider workload.
What it means for you
  • CIO / IT Executive: On Monday morning, initiate an audit of your organization's digital triage and patient portal systems to mandate the decoupling of clinical assessment engines from scheduling and routing systems. Instruct your enterprise architects to draft an API specification that separates clinical logic from localized resource constraints, laying the groundwork for a Tripartite Separation of Concerns architecture that allows routing rules to change dynamically without breaking clinical safety protocols.
  • IT Manager: On Monday morning, open the configuration panel of your patient-facing portal and create three distinct routing profiles based on the user's entry point (e.g., insurance portal, hospital website, or GP clinic page). Adjust the routing logic so that users entering via the GP clinic page are primarily served digital self-care modules and local pharmacy directories (targeting the study's finding of 41.6% self-care suitability at the local level) to deflect low-urgency cases away from emergency channels.
  • Business Strategist: On Monday morning, draft a business proposal for a cross-organizational steering agreement between your regional hospital network and local general practitioner (GP) associations. Use the study’s finding of a 5.4-fold difference in self-care rates between GPs and hospitals to propose a shared-risk pilot program where low-acuity patients on the hospital's digital waitlist are automatically routed to GP-managed digital self-care, reducing hospital emergency room crowding and capturing shared savings.
  • Researcher: On Monday morning, write a research protocol and institutional review board (IRB) proposal to study the long-term downstream metrics of cross-level patient navigation, specifically focusing on provider workload. Set up a longitudinal data-sharing agreement with a local clinic network to track physician consult times, staff burnout scores, and EHR click-rates before and after the implementation of a level-adaptive triage system to address the key limitation of the HEO study.
  • Policymaker: On Monday morning, draft a policy memorandum proposing a regional 'Digital Health Integration Sandbox' that offers financial incentives (such as targeted billing codes) for hospitals and ambulatory clinics that share a common, level-adaptive digital triage interface. Design these incentives specifically to bypass the rigid regulatory separation of ambulatory and inpatient billing, addressing the systemic fragmentation highlighted in the German system study.
Healthcare, Health Ecosystem, Ecosystem Orchestration, Design Principles, Action Design Research, Digital Triage
Multimodal Event Log Construction for Process Mining: Instantiating a Reference Architecture
DESRIST 2026 · Part I

Frank Sturm, Annina Liessmann and Martin Matzner
🎤 Lev Kenning & Lore Koestler
This study demonstrates and evaluates a reference architecture designed to construct process mining event logs from fragmented, multimodal data. By instantiating five core construction functions (parsing, abstraction and concretization, extraction, integration, and enrichment), the authors build a unified event log using CRM records, email metadata, and call transcripts. The system is evaluated via a real-world sales process case study in a small enterprise.

The problem. Organizations often store process data in highly fragmented and heterogeneous formats across different systems (such as CRMs, email clients, and phone records), preventing comprehensive process mining. Conventional approaches struggle to integrate these unstructured and structured modalities into a single coherent event log. This lack of integration leaves critical operational gaps, particularly in small and medium-sized enterprises that lack unified ERP systems.

Key findings.
– The instantiated architecture expanded the event log from a baseline of 9 activities and 142 events (CRM-only) to 23 activities and 723 events, providing significantly richer process insights.
– Temporal consistency was fully supported, with normalized timestamps achieving 100% monotone ordering across all 22 evaluated cases.
– Episode-based abstraction was supported, with 89.8% of evaluated conversational episodes rated as topically coherent by a domain expert.
– Automatic case resolution was supported, achieving an 86.4% correct automatic case assignment rate with the remaining 13.6% safely flagged as ambiguous for human review.
– Activity extraction labeling was only partially supported and showed mixed results; only 53.8% of proposed labels were rated as correct or mostly correct, with performance heavily dependent on the presence of strong keyword cues.
– Sentiment analysis at the episode level achieved a moderate accuracy of 68.3% (F1-score of 0.676), acting as an informative but noisy contextual signal.
– Study limitation: The evaluation is restricted to a single SME sales process and a small sample size of 22 cases.
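The temporal-consistency check reported above can be illustrated with a short sketch. It assumes that "monotone ordering" means timestamps are non-decreasing within each case once normalized to UTC; the summary does not spell out the exact definition, and the case data, activity labels, and function names below are hypothetical.

```python
from datetime import datetime, timezone


def to_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries a UTC offset and convert it to UTC."""
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:
        raise ValueError(f"timestamp lacks a UTC offset: {ts}")
    return dt.astimezone(timezone.utc)


def is_monotone(events: list) -> bool:
    """True if the normalized timestamps never decrease in the given event order."""
    times = [to_utc(event["timestamp"]) for event in events]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


# Hypothetical cases mixing CRM, email, and call events with different offsets.
cases = {
    "case-1": [
        {"activity": "CRM: lead created", "timestamp": "2025-03-01T09:00:00+01:00"},
        {"activity": "Email: offer sent", "timestamp": "2025-03-01T08:30:00+00:00"},
        {"activity": "Call: follow-up", "timestamp": "2025-03-01T11:00:00+02:00"},
    ],
    "case-2": [
        {"activity": "Email: inquiry", "timestamp": "2025-03-02T10:00:00+00:00"},
        {"activity": "CRM: lead created", "timestamp": "2025-03-02T10:30:00+01:00"},
    ],
}

monotone_share = sum(is_monotone(events) for events in cases.values()) / len(cases)
print(f"share of cases with monotone ordering: {monotone_share:.1%}")
```

The first case looks out of order in local time but is consistent after normalization, while the second looks ordered locally and is not; this is why the normalization step comes before the ordering check.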
What it means for you
  • CIO / IT Executive: Instruct your enterprise architecture team to map all API data endpoints for your CRM, email, and phone systems, and authorize a budget for a pilot project to integrate these fragmented sources into a unified event-log framework.
  • IT Manager: Write a script to extract, format, and normalize timestamps to UTC across both CRM database records and email metadata to achieve the 100% monotone temporal consistency demonstrated in the study.
  • Business Strategist: Meet with the sales operations lead to identify the top three blind spots in your CRM sales pipeline and define the specific email and call-transcript touchpoints required to map those hidden activities.
  • Researcher: Draft a methodology proposal for a study comparing zero-shot Large Language Model (LLM) labeling against keyword-based heuristics to overcome the 53.8% accuracy bottleneck in activity extraction.
  • Policymaker: Draft an internal data privacy directive requiring that any automated parsing of customer email metadata and call transcripts must strip out Personally Identifiable Information (PII) at the extraction stage before integration into process logs.
Event log construction, Multimodal event data, Customer relationship management, Activity extraction, Episode segmentation
Design Principles for Designing and Governing Hyperautomation Implementation
DESRIST 2026 · Part I

Nicolas Neis
🎤 Liv Knowles & Leo Kant
This study employs a Design Science Research approach to formulate and evaluate prescriptive design principles for scaling and governing hyperautomation. The researcher derived 17 meta-requirements and 11 design principles from a structured literature review and 11 semi-structured expert interviews. The resulting design knowledge was evaluated using a mixed-methods approach combining a quantitative survey of 100 managers and ex-post qualitative expert feedback.

The problem. While organizations increasingly adopt hyperautomation to integrate diverse technologies, most existing frameworks focus only on task-level automation. There is a lack of actionable guidance on how to systematically coordinate heterogeneous technologies and manage human-AI collaboration in complex processes. This research gap often results in operational fragmentation, high integration debt, and scaling failures within enterprises.

Key findings.
– Survey results showed robust statistical support for the 17 identified implementation requirements, with an overall mean agreement score of 5.37 out of 7.
– The 11 proposed design principles were statistically supported with an overall mean agreement score of 5.21 out of 7, with 'security and compliance by design' (DP8) and 'risk-tiered exception governance' (DP4) receiving the highest endorsements.
– The principle prioritizing API-based integration over UI-based RPA (DP2) received the lowest statistical support (mean of 4.72, with only 54% of respondents agreeing), indicating perceived barriers like legacy constraints and limited API availability.
– Ex-post expert validation (n = 11) yielded even higher mean agreement scores (5.98 for requirements and 5.84 for principles), emphasizing that integration-centric discipline is crucial despite its perceived difficulty.
– A limitation of this study is that the evaluation is based on perceived agreement and expert critique rather than longitudinal performance metrics from actual field deployments.
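The two survey statistics quoted above, a mean agreement score on a 7-point scale and the share of respondents agreeing, can be computed as sketched below. The responses are invented for illustration, `summarize_likert` is a hypothetical helper, and treating a rating of 5 or higher as "agreeing" is an assumption; the summary does not state which cut-off the study used.

```python
from statistics import mean


def summarize_likert(responses: list, agree_threshold: int = 5) -> dict:
    """Return the mean rating and the share of ratings at or above the threshold."""
    if not responses:
        raise ValueError("no responses given")
    if any(r < 1 or r > 7 for r in responses):
        raise ValueError("ratings must lie on the 1-7 scale")
    agreeing = sum(r >= agree_threshold for r in responses)
    return {
        "mean": mean(responses),
        "share_agree": agreeing / len(responses),
    }


# Hypothetical ratings for a single design principle.
sample = [7, 6, 5, 5, 4, 4, 3, 6, 2, 5]
summary = summarize_likert(sample)
print(summary)
```

Reporting both numbers is useful because a principle can have a middling mean while still dividing respondents, which the share of agreement makes visible.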
What it means for you
  • CIO / IT Executive: On Monday morning, issue a directive mandating that all new automation projects default to API-based integrations (DP2) and institute a strict architecture review process requiring written justification for any team proposing a UI-based RPA solution.
  • IT Manager: On Monday morning, review your active automation inventory and implement a 'risk-tiered exception governance' (DP4) spreadsheet, assigning specific human-in-the-loop escalation paths for automation failures based on low, medium, and high operational risk tiers.
  • Business Strategist: On Monday morning, identify the top three legacy business processes currently relying on UI-based RPA and build a business case to secure budget for migrating them to API-based integrations to reduce integration debt and scaling failures.
  • Researcher: On Monday morning, draft a research proposal and outreach email to industry partners to launch a 12-month longitudinal field study tracking actual operational KPIs (e.g., system downtime, maintenance hours) of organizations implementing these 11 design principles.
  • Policymaker: On Monday morning, draft an internal corporate policy template requiring all enterprise-wide automation and AI deployments to undergo a mandatory 'Security and Compliance by Design' (DP8) audit before receiving production clearance.
Hyperautomation, Implementation, Socio-technical Systems Theory, Design Science Research, Design Principles, Governance
Generating Design Knowledge Without Doing Design: The Design Researcher as Curator
DESRIST 2026 · Part II

Pedro Antunes and Andreas Drechsler
🎤 Liv Knowles & Leo Kant
This study conceptualizes a new mode of engagement in Design Science Research (DSR) termed the 'curator', where researchers analyze and synthesize unstructured practitioner accounts to build generalizable design knowledge. To define and demonstrate this approach, the authors outline a structured curation process and apply it to an illustrative project analyzing systems engineering trends from professional blog posts.

The problem. Traditional DSR often positions the researcher as a direct developer of design artifacts, which can limit the speed and scale at which design knowledge is accumulated. Furthermore, valuable, emergent practices documented informally by practitioners on platforms like professional blogs remain difficult to systematically process, structure, and generalize within current DSR frameworks.

Key findings.
– The 'curator' mode of engagement is established as a conceptually distinct DSR role, specialized in generating abstract, filtered, and generalized knowledge at an accelerated pace relative to traditional player or observer roles.
– A structured five-step curation process comprising data sourcing, analysis, generalization, abductive logic, and boundary control is proposed and conceptually supported.
– Using a dataset of approximately 92,766 words from Medium.com, the curation method successfully synthesized a process-theory-based framework characterizing the evolution of the systems engineering design function across four defining eras: Agile adaptation, DevOps adaptation, platform alignment, and coexistence with Artificial Intelligence.
– The study outlines potential limitations of this method, including the risk of analyzing transient industry trends that might rapidly fade, the volatility of online source data, and the risk of generating artifacts that may lack long-term theoretical resonance.
What it means for you
  • CIO / IT Executive: Initiate an internal assessment to map your current systems engineering and IT infrastructure against the four operational eras (Agile, DevOps, platform alignment, AI coexistence) to identify legacy gaps in your AI integration strategy.
  • IT Manager: Create a 'practitioner curation' workflow where team leads spend 30 minutes extracting actionable DevOps and platform-alignment solutions from active tech blogs (like Medium) to address current system bottlenecks.
  • Business Strategist: Launch a rapid trend-analysis sprint applying the five-step curation process to competitor and industry developer blogs to identify emergent systems engineering trends before they are captured by traditional market research.
  • Researcher: Identify a target design problem and draft a research protocol to apply the five-step curation process on a corpus of professional blogs, allowing you to generate generalizable design knowledge without the overhead of building a physical artifact.
  • Policymaker: Designate a working group to curate unstructured industry blog posts and practitioner forums to continuously update safety and compliance standards for AI-coexistence systems, bypassing slow, traditional standards-writing cycles.
Design Science Research, Research Engagement, Curator, Systems Engineering, Grey Literature, Knowledge Accumulation
A Digital Twin-Based System to Support Urban Planners in Mitigating the Urban Heat Island Effect
DESRIST 2026 · Part II

Iresha Bandaranayake, Dominik Siemon and Ram Gurung
🎤 Lev Kenning & Lore Koestler
This study develops and evaluates an interactive 3D digital twin system integrated with a machine learning model to simulate and forecast land surface temperatures. Using design science research, the authors built a prototype for the city of Lahti, Finland, to help urban planners virtually test the thermal effects of modifications like green roofs, water bodies, and vegetation.

The problem. As rapid urbanization exacerbates the urban heat island effect, city planners lack interactive, forward-looking decision support systems to evaluate thermal impacts prior to physical construction. Existing tools are largely descriptive and retrospective, failing to offer interactive workflows where practitioners can easily simulate and compare alternative planning interventions.

Key findings.
– The optimized machine learning model achieved high predictive performance, with R² = 0.96, RMSE = 1.16°C, and a correlation coefficient of r = 0.98.
– User evaluations supported the system's usability (mean 3.92/5) and usefulness (mean 4.12/5) as an interactive tool to compare alternative planning scenarios with stakeholders.
– Trust in the prediction results was only moderate (mean 3.70/5), indicating that high predictive performance alone does not ensure user confidence and that system designs must explicitly incorporate transparency and data explainability.
– The behavioral evaluation is limited by a small sample size of five domain experts and students, which restricts statistical generalizability.
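For readers less familiar with the three accuracy metrics quoted above, the sketch below shows how R², RMSE, and the Pearson correlation coefficient are computed from observed and predicted values. The temperatures are made up for illustration and are not data from the study; `regression_metrics` is a hypothetical helper using the standard textbook definitions.

```python
import math


def regression_metrics(y_true: list, y_pred: list) -> dict:
    """Compute R^2, RMSE, and Pearson r for paired observed and predicted values."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equally long sequences with at least two values")
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "pearson_r": cov / math.sqrt(ss_tot * ss_pred),
    }


# Hypothetical land surface temperatures in degrees Celsius.
observed = [30.0, 32.0, 34.0, 36.0]
predicted = [31.0, 32.0, 34.0, 35.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE is expressed in the same unit as the target, which is why it is reported in °C, while R² and r are unitless.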
What it means for you
  • CIO / IT Executive: Draft and issue an IT procurement directive requiring that all future digital twin and predictive spatial-modeling software purchases include native Explainable AI (XAI) components, such as interactive feature-importance visualizations, to directly address the user trust deficit identified in the Lahti study.
  • IT Manager: Schedule a kick-off meeting with your software development team to add hover-over 'explainability tooltips' and confidence intervals to the 3D digital twin's predictive temperature layers, allowing users to see exactly how the ML model calculated the thermal impact of a proposed green roof or water body.
  • Business Strategist: Create a business case proposal for local real estate developers outlining how using this predictive 3D simulation tool during pre-construction phases can reduce municipal heat mitigation compliance costs and improve public relations by virtually validating green-roof and vegetation impacts prior to breaking ground.
  • Researcher: Write and submit a research proposal to expand the study's user evaluation pool from 5 to at least 50 urban planning professionals across different Nordic cities, establishing a standardized testing protocol to overcome the small-sample limitation and validate the tool's behavioral generalizability.
  • Policymaker: Draft an agenda item for the next city council meeting proposing a policy mandate that requires major urban redevelopment projects to submit a digital microclimate simulation report (using validated predictive models) to prove they will not exacerbate the local urban heat island effect before receiving zoning approval.
Digital Twin, Urban Heat Island, Land Surface Temperature, Machine Learning, Urban Planning, Design Science Research
Collective Action Planning: A Method to Plan and Implement Circular Ecosystems
DESRIST 2026 · Part II

Anna Margolis and Fenna Blomsma
🎤 Lev Kenning & Lore Koestler
This study outlines the development and qualitative validation of the Collective Action Planning (CAP) method, a five-step facilitation framework designed to translate abstract multi-stakeholder visions into practical, near-term actions. The framework was iteratively refined and validated through sixteen cycles in academic, expert, and real-world project settings.

The problem. Multi-stakeholder initiatives often struggle to bridge the gap between long-term sustainability visions and immediate operational plans. Misaligned goals, complex ecosystems, and present-day constraints frequently lead to analysis paralysis or fragmented, uncoordinated efforts.

Key findings.
– The Collective Action Planning (CAP) method was qualitatively validated across 16 cycles, showing utility in helping partners transition from shared abstract visions to concrete operational plans.
– Facilitators reported that CAP successfully helped uncover hidden challenges, structured constructive stakeholder dialogues, and clarified expected learning outcomes.
– Evaluation revealed two distinct adoption patterns: a structured approach that builds a comprehensive system perspective, and a flexible approach that uses specific prompts to generate creative solutions.
– Limitation: The method has only been validated within the context of circular ecosystems in European geographies, and its applicability to other industries or non-European cultural contexts remains untested.
What it means for you
  • CIO / IT Executive: Launch a pilot project to audit the IT department's e-waste and hardware recycling lifecycle by scheduling a 90-minute workshop with your hardware vendors and disposal partners, utilizing the 5-step Collective Action Planning (CAP) framework to define concrete API-based tracking integrations instead of vague sustainability goals.
  • IT Manager: Set up a whiteboard session with your systems integration team using CAP's 'flexible prompting' approach to brainstorm and map out immediate, low-code data-sharing protocols that bridge your current ERP system with an external circular supply chain partner.
  • Business Strategist: Review the agenda of your next multi-stakeholder circular economy partnership meeting and restructure it around the 5-step CAP framework, forcing partners to translate abstract sustainability KPIs into a structured, 30-day operational action plan with clearly defined 'expected learning outcomes' for each participant.
  • Researcher: Draft a comparative research proposal to apply the CAP methodology to a non-European supply chain ecosystem, specifically contacting a research partner in North America or Asia to run a pilot workshop and test if the framework’s five steps require cultural adaptation.
  • Policymaker: Revise the application guidelines for the upcoming municipal circular economy grant cycle to mandate that multi-stakeholder applicants submit a concrete 90-day operational plan generated via the CAP framework, rather than just abstract long-term environmental visions.
circular economy, complexity, collective innovation, cross-sector partnerships, ecosystems, multi-stakeholder collaboration
Green Lean Six Sigma, A Modeling Tool-Based Approach
DESRIST 2026 · Part II

Green Lean Six Sigma, A Modeling Tool-Based Approach

Florian Johannsen and Hans-Georg Fill
🎤 Lev Kenning & Lore Koestler
This study proposes a conceptual modeling tool-based approach to Green Lean Six Sigma (GLSS) to support organizations in redesigning business processes for environmental sustainability. Developed as a design science research artifact using the ADOxx metamodeling platform, the prototype integrates environmental metrics into eleven traditional quality techniques. The solution was evaluated using illustrative manufacturing scenarios, expert interviews, and a System Usability Scale (SUS) study.

The problem. Despite growing demand for resource-efficient business processes, there is currently no universally accepted Green Lean Six Sigma standard or structured framework. Consequently, practitioners lack clear guidelines on how to modify traditional quality techniques for environmental goals. Furthermore, there is insufficient research on how to effectively codify and document GLSS project results using modeling software.

Key findings.
– The design of 11 GLSS conceptual model types successfully demonstrates how environmental metrics (such as the 'Voice of the Environment') can be integrated into traditional quality management techniques.
– In a System Usability Scale (SUS) study with 10 participants, 90% of users rated the prototype highly in terms of being easy to use, easy to learn, and instilling confidence.
– Qualitative feedback from three process management and BI experts confirmed that the modeling tool effectively supports all phases of the DMAIC cycle and enhances practitioners' understanding of environmental inputs and outputs.
– The research is limited by its early-stage iteration, a small evaluation sample (10 students as proxies), and a limited set of quality techniques, with real-world field evaluations planned for future work.
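For readers unfamiliar with the System Usability Scale mentioned above, the following is a minimal sketch of the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5. The function name and the example ratings are hypothetical and are not taken from the study, whose summary reports item-level agreement rather than raw scores.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten 1-5 ratings."""
    if len(responses) != 10:
        raise ValueError("SUS uses exactly ten items")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each rating must be between 1 and 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute rating - 1.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - rating.
            total += 5 - rating
    return total * 2.5


# Hypothetical participant, not data from the study.
example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))  # prints 85.0
```

With only 10 participants, as in the study, an average of such scores is best read as indicative rather than conclusive.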
What it means for you
  • CIO / IT Executive: Direct your enterprise architecture team to download the ADOxx metamodeling platform and conduct a feasibility assessment on integrating 'Voice of the Environment' data fields into your organization's existing Business Process Management (BPM) software.
  • IT Manager: Create a sandbox environment on the ADOxx platform and configure a basic SIPOC (Suppliers, Inputs, Process, Outputs, Customers) model template that includes dedicated fields for tracking energy consumption and carbon emissions for each process step.
  • Business Strategist: Review your active operational improvement projects and mandate that the next DMAIC project charter must explicitly define 'Voice of the Environment' (VoE) metrics, such as waste generation or resource consumption, alongside traditional cost and quality targets.
  • Researcher: Draft a collaboration proposal to a local manufacturing partner to run a comparative field study, testing the 11 Green Lean Six Sigma model types in an active production environment to validate the prototype beyond the initial student-proxy sample.
  • Policymaker: Initiate a working group with regional industrial standards bodies to draft a standardized framework that formally incorporates Green Lean Six Sigma modeling guidelines into local environmental compliance and ISO 14001 certification audits.
Green Lean Six Sigma, Quality Technique, Metamodel, Design Science Research, Conceptual Modeling, Business Process Management
Critical Incident Technique for Semi-Naturalistic DSR Evaluation: A Methodology and an Illustration
DESRIST 2026 · Part II

Critical Incident Technique for Semi-Naturalistic DSR Evaluation: A Methodology and an Illustration

Charon Abbott, Mary Tate, Wasana Bandara and Lisa Tam
🎤 Lev Kenning & Lore Koestler
This study outlines and demonstrates the use of the Critical Incident Technique (CIT) as a semi-naturalistic evaluation approach in Design Science Research (DSR). It presents a five-step methodology and illustrates its application using a case study of a stakeholder engagement tool for Business Process Management (BPM) projects. The paper assesses the technique's feasibility, participant experiences, and researcher advantages.

The problem. Evaluating DSR artifacts in fully naturalistic, real-world settings is often expensive, time-consuming, and difficult to execute. Conversely, artificial evaluations like simulations often lack real-world validity and fail to engage participants deeply. There is a gap in existing methods for an intermediate approach that captures context-sensitive insights while maintaining feasibility.

Key findings.
– Verisimilitude: Qualitative data from practitioner interviews supported that using recalled real-world critical incidents provides high authenticity and realism compared to artificial scenarios.
– Increased Engagement: Participants demonstrated greater engagement and interest when evaluating the artifact against real incidents they personally experienced rather than simulated scenarios.
– Deep Reflection: The CIT method was shown to encourage deep reflection, allowing participants to uncover unexpected insights and shift their perspective on past incidents.
– Resource and Cost Efficiency: From the researchers' perspective, the CIT evaluation was highly resource-efficient, requiring only 20 hours of participant interaction to obtain rich data that would be costly and complex to capture in a fully naturalistic field study.
– Limitations: The findings are qualitative and illustrative, based on a small sample of 10 practitioners, and do not empirically establish superiority or statistical impact over alternative evaluation approaches.
What it means for you
  • CIO / IT Executive: Before signing off on the budget for a major enterprise software rollout, instruct your enterprise architecture team to halt the planned expensive vendor-led pilot and instead run a 20-hour 'Critical Incident' review where key department heads test the new platform against three actual, severe system outages or operational failures your company experienced last year.
  • IT Manager: To evaluate a new system monitoring tool before deployment, schedule a 2-hour session this afternoon with three senior DevOps engineers, ask them to recall their most stressful system crash from the last six months, and have them walk through exactly how they would diagnose and resolve that specific past incident using the new tool.
  • Business Strategist: Review your proposed Business Process Management (BPM) redesign by convening a 90-minute workshop with front-line managers; instead of walking through idealized flowcharts, ask them to stress-test the new process against their three worst client onboarding disasters of the past quarter to identify practical gaps.
  • Researcher: Revise the evaluation chapter of your current Design Science Research (DSR) manuscript to replace the planned artificial simulation with a Critical Incident Technique (CIT) protocol, and draft an interview guide that prompts ten target practitioners to walk through your prototype using their own recalled industry crises.
  • Policymaker: Update your department’s IT procurement and software audit guidelines to mandate a 'historical critical incident' testing phase, requiring vendors to demonstrate how their solutions would have mitigated or resolved three specific, documented government system failures from the past five years.
Design Science Research (DSR), Critical Incident Technique, Evaluation, Case study, Semi-Naturalistic Evaluation, Business Process Management
Designing Mobile Applications for Spontaneous Volunteers: Insights from a Design Science Research Approach
DESRIST 2026 · Part II

Designing Mobile Applications for Spontaneous Volunteers: Insights from a Design Science Research Approach

Enrico Milutzki, Lucas Memmert, Marten Borchers, Ramazan Zeybek, Valeria Magdych, Martin Semmann and Eva A. C. Bittner
🎤 Liv Knowles & Leo Kant
This study investigates the design of mobile applications to coordinate spontaneous volunteers during storm surge crises. The researchers followed a design science research methodology to build an interactive mobile prototype and qualitatively evaluated it with citizens and emergency response experts.

The problem. Spontaneous volunteers provide critical help during disasters, but integrating them into official response efforts is difficult due to unstructured communication and safety hazards. Current mobile warning apps are limited to one-way alerts, leaving a gap in two-way coordination and collaborative volunteer management.

Key findings.
– A centralized, structured information dashboard and stepwise registration process were qualitatively supported by both citizens and experts as effective for quick onboarding.
– Concise digital safety briefings were qualitatively supported as valuable preparation tools, though experts noted they cannot fully replace on-site physical instruction.
– Peer-to-peer communication channels received mixed support, with volunteers valuing self-organization while experts raised concerns about misinformation and lack of moderation.
– Gamification and motivational features (such as progress badges) received mixed support, as some volunteers found them motivating while others deemed them inappropriate for disaster contexts.
– Study limitations include a small qualitative sample of 12 digitally literate participants and evaluation based on simulated walkthroughs rather than real-world deployment.
What it means for you
  • CIO / IT Executive: Instruct your development leads to immediately halt any roadmap initiatives for gamification features (such as progress badges) in your crisis response applications, redirecting those engineering resources to build a secure, centralized data dashboard.
  • IT Manager: Schedule a kickoff meeting with your UX/UI design team to map out a simplified, stepwise mobile onboarding flow and draft a concise, mandatory digital safety briefing screen for the volunteer application.
  • Business Strategist: Review the product requirement document (PRD) for the volunteer app and replace unmoderated peer-to-peer chat features with a moderated, one-way official broadcast channel to mitigate the operational risk of misinformation during crises.
  • Researcher: Draft a research proposal and partner with a local emergency management agency to plan a field study that tests the mobile prototype with at least 50 non-tech-savvy users during an upcoming physical disaster simulation to address the current study's digital literacy bias.
  • Policymaker: Draft an amendment to the municipal emergency response protocol mandating that any digital volunteer coordination tool must use a hybrid onboarding model, requiring both an in-app digital safety briefing and a verified in-person safety sign-off at the deployment site.
Crisis Response, Spontaneous Volunteers, Mobile Application, Design Science Research, Disaster Management
Designing Immersive Game-Based Pretraining for Business Simulations: An Action Design Research Study
DESRIST 2026 · Part II

Designing Immersive Game-Based Pretraining for Business Simulations: An Action Design Research Study

Anna Wenzel, Jan-Martin Geiger and Andreas Liening
🎤 Liv Knowles & Leo Kant
This study utilizes an Action Design Research approach to develop and refine an immersive, 3D game-based pretraining environment on the Roblox platform. The researchers evaluated the design across two cycles, first with an alpha-version exploratory test among student teachers (n=12) and subsequently via a large-scale field study with university students (n=82 experimental, n=107 control) to observe cognitive and motivational readiness.

The problem. Novice learners often face severe cognitive overload and motivational anxiety when introduced to highly complex business simulation games. Traditional pretraining strategies typically rely on passive instructional formats like video or text, which do not actively engage learners. Consequently, there is a design gap in understanding how interactive, immersive virtual worlds can be systematically structured to prepare students cognitively and motivationally without inducing additional extraneous cognitive load.

Key findings.
– The evaluation was exploratory and interpretive in nature, meaning no formal statistical significance testing or causal inferences were conducted to compare the groups.
– Descriptively, participants across the immersive pretraining groups reported low levels of extraneous cognitive load and moderate-to-high germane cognitive load during the pretraining phase.
– Enjoyment was descriptively high and anxiety remained low across all pretraining groups, but perceived autonomy was relatively low due to the highly linear progression of the task structure.
– A technical breakdown during the field deployment showed that more than half of the students who encountered a bug did not return after the system was fixed, demonstrating a practical need for robust auto-save and progression recovery features in virtual learning environments.
What it means for you
  • CIO / IT Executive: Issue a policy directive to your enterprise architecture team mandating that all future immersive 3D training and simulation platforms must include automatic progress-saving and session-state recovery features to prevent user attrition during technical disruptions.
  • IT Manager: Audit your organization's current training simulation platforms on Monday morning, identify any lacking auto-save or quick-recovery capabilities, and log a high-priority ticket to implement telemetry that tracks where users drop off during system glitches.
  • Business Strategist: Review the curriculum of your upcoming employee onboarding simulations and introduce branching choice points into the training modules to increase learner autonomy, moving away from highly linear task structures that decrease engagement.
  • Researcher: Draft a research protocol to conduct a rigorous, pre-registered randomized controlled trial (RCT) with a larger sample size to quantitatively evaluate the causal effects of immersive pretraining versus traditional text/video formats on cognitive load.
  • Policymaker: Create and distribute a formal educational technology standard requiring public training programs to evaluate learning tools for cognitive load metrics (extraneous vs. germane) and to establish strict technical recovery thresholds before wide-scale deployment.
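The auto-save and progression recovery need noted above can be sketched in a platform-neutral way. The example below is an illustration only, not the study's implementation and not the Roblox API: the function names, the JSON file layout, and the step names are all hypothetical. It checkpoints a learner's state after each completed step and writes the file atomically, so an interrupted session can resume from the last checkpoint.

```python
import json
import os
import tempfile


def save_progress(path, state):
    """Write the learner's state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        # os.replace swaps the file in one step when both paths share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_progress(path, default):
    """Return the last saved state, or the default when nothing usable was saved."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


# Hypothetical session: checkpoint after every completed step, then recover.
with tempfile.TemporaryDirectory() as workdir:
    save_file = os.path.join(workdir, "learner_42.json")
    state = load_progress(save_file, {"completed_steps": []})
    for step in ["tutorial", "market_basics"]:
        state["completed_steps"].append(step)
        save_progress(save_file, state)
    restored = load_progress(save_file, {"completed_steps": []})
    print(restored["completed_steps"])
```

Checkpointing after each step, rather than only at the end of a session, is what limits the progress a learner can lose to a single step.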
Action Design Research, Pretraining, Immersive Environment, Business Simulation Game, Cognitive Load, Instructional Design
Design Science Research in an Era of Generative AI, Challenges and Theoretical Guidelines
DESRIST 2026 · Part II

Design Science Research in an Era of Generative AI, Challenges and Theoretical Guidelines

Philipp Zur Heiden, Daniel Beverungen, Christian Bartelheimer and Christoph Breidbach
🎤 Lev Kenning & Lore Koestler
This study presents an integrative literature review of 64 conference papers to evaluate how researchers design information technology artifacts using Large Language Models (LLMs). The authors analyze current design practices to identify challenges and propose theoretical guidelines for generative AI integration. The study is limited to conference papers published in 2025.

The problem. Designers increasingly integrate generative AI into systems, but they frequently treat these complex models as black boxes. This obscures what is actually designed versus what the underlying AI naturally provides, threatening the reproducibility and durability of design knowledge. Consequently, there is a lack of methodological guidance to navigate the non-deterministic nature and rapid evolution of generative AI.

Key findings.
– Descriptive analysis showed that 82.8% of the reviewed papers (53 of 64) black-boxed the design features of the LLM, failing to describe its core functionality, reasoning, or behavior in detail.
– Relying on prompting without further model contextualization was the dominant approach, observed in 60.9% (39 papers) of the reviewed studies, whereas only 32.8% (21 papers) contextualized their LLM applications.
– Attempts to generalize prescriptive design knowledge were absent in 60.9% (39 papers) of the sample, which did not abstract findings into formal design principles.
– The study conceptually synthesized five primary challenges (obscure composition, opaque contextualization, fragile consistency, knowledge erosion, and lack of guidance) and proposed three corrective guidelines to help researchers structure LLM-enabled artifact design.
– A noted limitation is that the literature review focused exclusively on 2025 conference publications, meaning journal-level publications with potentially more mature methodologies were not analyzed.
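The percentages in the findings above follow directly from the paper counts and the sample size of 64. A minimal sketch that recomputes them is shown here; the dictionary labels are shorthand introduced for this example, not terms from the study.

```python
TOTAL_PAPERS = 64  # size of the reviewed sample, as stated in the summary

# Paper counts as reported in the key findings; labels are illustrative shorthand.
counts = {
    "black-boxed LLM design features": 53,
    "prompting without contextualization": 39,
    "contextualized LLM application": 21,
    "no generalized design knowledge": 39,
}


def share(count, total=TOTAL_PAPERS):
    """Return the count as a percentage of the sample, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The recomputed shares (82.8%, 60.9%, 32.8%, and 60.9%) match the reported figures.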
What it means for you
  • CIO / IT Executive: Establish a mandatory 'GenAI Artifact Registry' policy requiring all IT project leads to document the exact system architecture, specific model versions, and contextualization methods (like RAG) used in active pilots to prevent black-box dependencies.
  • IT Manager: Conduct a code and architecture review of your team's active LLM projects to identify instances of 'naked prompting' and mandate that developers implement and document explicit system-level guardrails and contextualization mechanisms.
  • Business Strategist: Review the business cases of your pipeline GenAI initiatives to assess dependency risks, rejecting projects that rely solely on basic API prompting in favor of those building proprietary value through custom workflows and data integration.
  • Researcher: Open your current GenAI research manuscript and draft a dedicated 'Design Principles' section that explicitly details the LLM's prompt templates, model versions, and reasoning flows to ensure your findings are reproducible and generalizable.
  • Policymaker: Draft a new clause for your IT procurement and compliance guidelines requiring external AI vendors to submit detailed documentation disclosing prompt methodologies, underlying model dependencies, and consistency-testing protocols.
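The documentation these recommendations call for (model version, prompt template, contextualization method) could be captured in a simple record. The sketch below is one hypothetical shape for such a registry entry; the class name, field names, and example values are assumptions made for illustration and do not come from the study.

```python
from dataclasses import dataclass, field


@dataclass
class GenAIArtifactRecord:
    """One hypothetical registry entry describing how an LLM is used in a project."""
    project: str
    model_name: str
    model_version: str
    prompt_template: str
    contextualization: list = field(default_factory=list)  # e.g. ["RAG"]

    def documentation_gaps(self):
        """List what is missing for the artifact to be reproducible."""
        gaps = []
        if not self.model_version.strip():
            gaps.append("model version not recorded")
        if not self.prompt_template.strip():
            gaps.append("prompt template not recorded")
        if not self.contextualization:
            gaps.append("prompting only, no contextualization documented")
        return gaps


# Hypothetical entries; the project and model names are placeholders.
documented = GenAIArtifactRecord(
    project="support-assistant",
    model_name="example-llm",
    model_version="2025-01",
    prompt_template="Answer using only the retrieved passages: {passages}",
    contextualization=["RAG"],
)
undocumented = GenAIArtifactRecord(
    project="quick-pilot",
    model_name="example-llm",
    model_version="",
    prompt_template="Summarize: {text}",
)

print(documented.documentation_gaps())
print(undocumented.documentation_gaps())
```

A check like `documentation_gaps` makes black-boxed usage visible at review time instead of after a model update changes behavior.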
Design Science Research, Generative Artificial Intelligence, Large Language Model, Fit and Misfit, Guidelines, Literature Review
Design Science at Scale: Applying Consortium Research Methodology in Open Source Ecosystems
DESRIST 2026 · Part II

Design Science at Scale: Applying Consortium Research Methodology in Open Source Ecosystems

Markus Spiekermann, Julia Pampus and Boris Otto
🎤 Lev Kenning & Lore Koestler
This study explores how the Consortium Research (CR) methodology can be applied to large-scale, enterprise-grade open source projects. Using the Eclipse Dataspace Components (EDC) project as a qualitative case study, the researchers examine how combining CR with open-source practices bridges the gap between academic rigor and practical industry relevance.

The problem. Sustaining both scientific rigor and practical relevance in Design Science Research is challenging in highly dynamic, multi-organizational, and technology-intensive fields. Traditional collaborative research methods are typically localized and lack the structured governance needed to scale and accumulate design knowledge across diverse stakeholders.

Key findings.
– The study uses a qualitative, interpretive single-case research design, so its findings rest on case analysis rather than statistical testing.
– Combining Consortium Research (CR) with open source practices was shown to be a viable approach to scale collaborative design science research, enabling over a dozen organizations to contribute to a shared, production-ready artifact.
– Open source governance models (such as those of the Eclipse Foundation) effectively complement CR's role structure by providing structured mechanisms for decision-making, contribution management, and IP handling.
– Major collaborative challenges were identified, including personnel turnover, misalignment of short-term and long-term goals, and hidden strategic agendas, which require active management and soft skills to resolve.
– Study Limitation: Generalizability is limited as the findings are based on a single in-depth case study of the Eclipse Dataspace Components project within a specific industry context.
What it means for you
  • CIO / IT Executive: Review your organization's active multi-partner R&D projects and initiate a transition of at least one collaborative software initiative to a mature open-source foundation (like Eclipse) to leverage established IP handling and structured decision-making governance.
  • IT Manager: Create a dedicated onboarding document and a bi-weekly 'alignment sync' for your developers contributing to shared open-source consortia to actively track and mitigate the impacts of personnel turnover and shifting partner agendas.
  • Business Strategist: Analyze your industry's current open-source dataspace initiatives (such as Eclipse Dataspace Components) and draft a strategic proposal to join a relevant consortium, shifting R&D costs to a shared, production-ready industry standard.
  • Researcher: Embed your next Design Science Research (DSR) artifact within an active open-source ecosystem, using the host project's issue tracker and pull-request reviews as a qualitative feedback loop to validate your design knowledge.
  • Policymaker: Amend the compliance requirements for upcoming public-private technology grants to mandate that collaborative software outputs must use an established open-source governance model to ensure long-term viability and prevent proprietary capture.
Design Science Research, Consortium Research, Open Source Ecosystems, Eclipse Dataspace Components, Multi-Stakeholder Collaboration, Knowledge Transfer, Dataspaces
Beyond the Golden Record: Toward a Design Theory for Trustworthy Master Data Management with Self-sovereign Identity
DESRIST 2026 · Part II

Beyond the Golden Record: Toward a Design Theory for Trustworthy Master Data Management with Self-sovereign Identity

Niklas Schulte, Isaac Henderson Johnson Jeyakumar, Michael Kubach and Christian Janiesch
🎤 Liv Knowles & Leo Kant
This study applies a Design Science Research (DSR) approach to develop and evaluate a design theory for cross-organizational master data management. The researchers combined a hermeneutic literature review with semi-structured industry expert interviews to design a framework and integrate it into a reference architecture for decentralized data spaces.

The problem. Traditional master data management depends on centralized registries or commercial data brokers to ensure data quality, creating strategic dependencies and compliance risks. Existing decentralized identity technologies lack the semantic integration required to manage master data, leaving a gap in how organizations can maintain accurate and sovereign data exchange without central intermediaries.

Key findings.
– The design theory, comprising five Design Principles (DPs) and eight Design Features (DFs), was qualitatively supported by industry experts who validated its utility in establishing organizational trust, semantic interoperability, and usage policy integration.
– The feasibility of the design was supported through successful integration and instantiation within the International Data Spaces Association (IDSA) Reference Architecture Model 4.0.
– Expert evaluation highlighted challenges (not yet statistically tested) regarding legal enforceability across jurisdictions, operational complexity, and key management overhead.
– The study's findings are limited by a small qualitative evaluation sample (four experts) and a primary focus on technical feasibility over broader empirical business adoption.
What it means for you
  • CIO / IT Executive: Review your organization's current spending and compliance risks associated with centralized commercial data brokers, and commission a feasibility study on adopting the IDSA Reference Architecture Model 4.0 for a decentralized supplier master data pilot.
  • IT Manager: Set up a sandbox environment to evaluate the key management overhead and technical complexity of integrating Self-Sovereign Identity (SSI) with your existing Master Data Management systems, using the study's five Design Principles as a blueprint.
  • Business Strategist: Schedule a meeting with your top three supply chain partners to propose a decentralized data-sharing consortium, using the study's trust-building framework to pitch the elimination of middleman data broker fees.
  • Researcher: Draft a quantitative survey methodology to test the study's 5 Design Principles and 8 Design Features across a larger sample of 100+ IT architects, specifically measuring the perceived operational complexity of decentralized identity management.
  • Policymaker: Draft a policy brief proposing a regulatory sandbox to address the cross-jurisdictional legal enforceability of decentralized master data exchanges and self-sovereign identities under local data governance laws.
Master Data Management, Self-Sovereign Identity, Data Ecosystem, Data Space, Data Sovereignty
Design Principles for Ethical Automated Mental Workload Monitoring in the Industrial Internet of Things
DESRIST 2026 · Part II

Design Principles for Ethical Automated Mental Workload Monitoring in the Industrial Internet of Things

Fatma Demircan, Maximilian Nebel and Christian Janiesch
🎤 Lev Kenning & Lore Koestler
This study develops and evaluates design principles for the ethical deployment of automated mental workload monitoring systems (Auto-MWMS) in industrial environments. Using design science research, the authors construct a conceptual framework from literature and refine it through qualitative interviews with IT and industrial experts. The resulting principles aim to balance operational efficiency with human-centric values like privacy and autonomy.

The problem. Integrating humans-in-the-loop within the Industrial Internet of Things (IIoT) increases cognitive demands, highlighting the need to monitor workers' mental workload to prevent errors. However, such automated monitoring risks introducing intrusive surveillance, violating worker privacy, and compromising employee autonomy. While existing literature discusses these ethical challenges, a practical gap remains in translating these abstract concerns into actionable system design principles.

Key findings.
– All nine developed design requirements, including compliance with legal regulations, respect for privacy, and minimization of errors, were qualitatively supported by domain experts as comprehensively capturing the key ethical challenges of Auto-MWMS.
– All eight proposed design principles (such as restricted data use, pausing of data collection, and customizable worker permissions) were qualitatively supported by practitioners as understandable, relevant, and suitable for system design.
– There was no statistical hypothesis testing performed, as the study relied entirely on qualitative expert evaluation.
– The study acknowledges limitations, including a small sample size of three expert interviews and the fact that the design principles have not yet been implemented or tested in real-world IIoT systems.
What it means for you
  • CIO / IT Executive: Update the procurement and system architecture guidelines for all upcoming IIoT monitoring projects to mandate that any automated mental workload monitoring system (Auto-MWMS) must include native features for data-collection pausing and customizable worker privacy permissions.
  • IT Manager: Schedule a meeting with your software development team to review the interface designs of your IIoT systems, and instruct them to integrate a prominent, easily accessible 'Pause Data Collection' toggle for operators on the shop floor.
  • Business Strategist: Draft an operational policy document on 'restricted data use' that explicitly defines how mental workload data will be used solely for real-time safety and task-reallocation, and strictly prohibits its use in annual performance evaluations or disciplinary actions.
  • Researcher: Write a research proposal and design an experimental protocol to test the study's eight ethical design principles in a live, real-world IIoT pilot environment, focusing on gathering quantitative data on worker trust and system error rates to address the current gap in empirical testing.
  • Policymaker: Draft an organizational compliance standard for automated worker monitoring that legally and operationally requires systems to provide employees with granular consent options and the right to temporarily opt out of cognitive data collection during their shifts.
Design Principles, Ethics, IIoT, Mental Workload, Automated Monitoring, Design Science Research
Orchestrating Scaffolding AI Agents: Design Principles for Mechanism-Specific Learner Support
DESRIST 2026 · Part II

Orchestrating Scaffolding AI Agents: Design Principles for Mechanism-Specific Learner Support

Diana Kozachek and Andreas Janson
🎤 Lev Kenning & Lore Koestler
This study uses a Design Science Research approach to develop and evaluate a multi-agent AI tutoring system that orchestrates specialized scaffolding mechanisms (conceptual, procedural, strategic, and metacognitive). The prototype was iteratively refined with researchers, educators, and students, followed by a quantitative analysis of scaffold effectiveness in a controlled concept mapping task.

The problem. Generic generative AI models handle user queries uniformly, which often encourages cognitive offloading and passive copying rather than active skill development. Standalone AI tools lack pedagogical grounding, are unable to sequence different support mechanisms dynamically, and do not maintain persistent learner profiles to gradually reduce assistance over time.

Key findings.
– Metacognitive scaffolds delivered first in the tutoring sequence produced the highest concept map growth (M = 2.66, SD = 3.50, Cohen's d = 0.96), which was statistically supported (95% CI [1.24, 1.91], excluding zero).
– Scaffolds delivered later in the sequence (strategic, procedural, and conceptual) did not yield statistically significant effects on learning behavior, as their 95% confidence intervals included zero.
– A focus group evaluation demonstrated that simultaneous exposure to a novel interface, complex tasks, and multi-agent interactions can cause cognitive overload in novice learners.
– The study's limitations include a formative evaluation context, a student sample restricted to one European business school, and pending summative learning outcome measurements.
What it means for you
  • CIO / IT Executive: Direct your enterprise architecture team to audit all custom internal GenAI chatbot system prompts and mandate a 'metacognitive-first' routing rule that forces the AI to ask users to clarify their objective or thought process before providing direct answers, thereby mitigating passive cognitive offloading across the organization.
  • IT Manager: Simplify the user interface of your current AI deployment by hiding behind-the-scenes multi-agent workflows and backend reasoning logs, presenting novice users with a single, clean chat interface to prevent cognitive overload.
  • Business Strategist: Update your customer-onboarding and employee-training platform roadmaps to replace 'instant-answer' AI help desks with a sequential 'guided tutoring' agent, prioritizing metacognitive prompts in the initial user interaction phase to increase training completion rates and long-term skill retention.
  • Researcher: Draft a pre-registration protocol to replicate this multi-agent scaffolding study using a diverse, cross-institutional student sample, adding a delayed post-test to measure long-term knowledge retention and address the limitations of the current formative evaluation.
  • Policymaker: Draft and insert a new clause into your organization's AI Procurement Guidelines requiring educational software vendors to prove that their AI systems include active learning mechanisms (like metacognitive prompting) and exclude design features that facilitate passive, direct copy-pasting by students.
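A 'metacognitive-first' routing rule of the kind suggested above could be sketched as follows. This is a hypothetical illustration, not the study's orchestrator: the order of the scaffolds after the metacognitive one and all names are assumptions.

```python
# Assumed delivery order: only "metacognitive first" comes from the summary;
# the order of the remaining three scaffolds is a placeholder.
SCAFFOLD_ORDER = ["metacognitive", "strategic", "procedural", "conceptual"]


def next_scaffold(delivered):
    """Return the next scaffold type to deliver, or None when all were used."""
    for kind in SCAFFOLD_ORDER:
        if kind not in delivered:
            return kind
    return None


session = []
while (kind := next_scaffold(session)) is not None:
    session.append(kind)

print(session[0])  # metacognitive
```

Keeping the order in one list makes it easy to test alternative sequences against each other.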
AI Agents, Orchestration, Design Science Research, Scaffolding, Multi-Agent Systems, Educational Technology
Developing Design Principles: Navigating the Design Knowledge Space with a Mode-Based and Abstraction-Aligned Framework
DESRIST 2026 · Part II

Developing Design Principles: Navigating the Design Knowledge Space with a Mode-Based and Abstraction-Aligned Framework

Timo Strohmann and Robert Winter
🎤 Lev Kenning & Lore Koestler
This study proposes a conceptual framework to guide the development of design principles within design science research. It introduces a mode-based reasoning framework, an abstraction-alignment model, and five navigation moves to help researchers systematically align design requirements, principles, and features. The utility of the framework is demonstrated through a retrospective reconstruction of a longitudinal research project on virtual companionship.

The problem. Developing design principles that are both actionable and generalizable is a persistent challenge in design science research. Existing methodologies often focus on rigid activity sequences or structural formulas without explaining how researchers actually navigate different abstraction levels. Consequently, developed principles are frequently either too abstract to offer concrete guidance or too closely tied to specific implementations to be transferable.

Key findings.
– The study proposes a mode-based framework (conceptually supported) that models design principle development as a dynamic, iterative process across Framing, Reflection, and Synthesis modes.
– The abstraction-alignment model holds (conceptually supported) that design principles are most stable when design requirements, principles, and features align at a compatible abstraction level, referred to as the conceptual sweet spot.
– The framework identifies and outlines five recurring epistemic actions (Abstraction, Specialization, Instantiation, Alignment, and Stabilization) that describe horizontal and vertical movements within the design knowledge space.
– No statistical hypotheses were proposed or quantitatively tested in this conceptual study.
– The primary limitation is that the framework was illustrated using a single retrospective reconstruction of a virtual companion project rather than validated across multiple live research domains.
What it means for you
  • CIO / IT Executive: Review the architecture blueprints of your top-priority digital transformation project on Monday morning. Task your lead enterprise architect with creating an 'abstraction-alignment' matrix that explicitly maps your high-level business goals to mid-level architectural design principles and specific software features to identify and eliminate mismatch risks before funding the next phase.
  • IT Manager: Gather your development team for a 30-minute session on Monday morning to review the current sprint backlog. Select one complex feature and trace it vertically to verify that its technical implementation directly aligns with a defined project design principle and a high-level user requirement, resolving any gaps where features have drifted from the original project intent.
  • Business Strategist: On Monday morning, audit the product design guidelines used for your digital product portfolio. Replace vague, abstract strategic goals (e.g., 'enhance user engagement') with mid-level, actionable design principles that explicitly state *how* a specific product capability translates strategic business requirements into concrete user-facing features.
  • Researcher: On Monday morning, open your current Design Science Research (DSR) manuscript draft and map your artifact's development process against the five epistemic moves (Abstraction, Specialization, Instantiation, Alignment, Stabilization) to systematically justify and document how you navigated from requirements to features.
  • Policymaker: On Monday morning, review the draft of your organization's IT governance or compliance standards. Revise the guidelines to avoid dictating rigid, rapidly outdated technical feature specifications; instead, define middle-tier 'design principles' that translate abstract policy objectives into actionable but flexible implementation constraints for system developers.
Design Science Research, Design Principles, Design Knowledge, Abstraction, Synthesis, Design Features
Toward a Science of the Unexpected: Serendipity as an Epistemic Mechanism in Design Science Research
DESRIST 2026 · Part II

Toward a Science of the Unexpected: Serendipity as an Epistemic Mechanism in Design Science Research

Hanna Buyssens
🎤 Lev Kenning & Lore Koestler
This conceptual study proposes a theoretical framework to integrate serendipity into Design Science Research (DSR). The author develops a structured, four-phase model and guiding principles to systematically recognize, evaluate, and transform unexpected design outcomes into rigorous knowledge.

The problem. Traditional DSR methodologies treat unexpected findings and anomalies as deviations or noise to be resolved through iterative refinement. This orientation restricts researchers from leveraging emergence and surprise as productive sources for reframing problems and generating discovery, particularly in highly uncertain or complex environments.

Key findings.
– This is a purely conceptual and theory-building paper; it contains no empirical testing, statistical hypotheses, or quantitative data.
– Conceptualized a four-phase serendipity cycle model that details how designers can transition an anomaly from noise to signal: (1) encountering an unexpected trigger, (2) recognizing and reflecting, (3) making an abductive leap, and (4) materializing the insight into design knowledge.
– Proposed three guiding principles to enable serendipitous discovery: constructing artifacts with deliberate incompleteness, cultivating a prepared mind, and practicing collective, concurrent, and reflexive evaluation.
– Identified limitations of the framework, noting that the boundary conditions between the proposed phases are not yet empirically tested, decision gates for when to pursue versus park anomalies are not fully operationalized, and structural funding regimes may constrain its adoption.
What it means for you
  • CIO / IT Executive: Revise the standard IT pilot project charter template on Monday morning to include an 'Intentional Incompleteness' section, dedicating 10% of the project's resource budget to unresolved, open-ended design features to allow for user-driven adaptation and unexpected system insights.
  • IT Manager: In your Monday morning team stand-up, dedicate the final 10 minutes to a 'Weekly Anomalies' round-robin, asking developers to share one unexpected system behavior or user workaround they observed last week, and select one to explore for potential system reframing rather than immediately patching it.
  • Business Strategist: On Monday morning, pull customer service logs from the past month and search specifically for 'unusual workarounds' or 'off-label' product uses, then schedule a 30-minute afternoon session with product leads to evaluate if these unexpected user behaviors can be monetized into new features.
  • Researcher: Review your active prototype design on Monday morning and intentionally remove one finalized, non-essential user interface element, then schedule usability tests for later in the week to observe how users organically fill this deliberate functional gap.
  • Policymaker: On Monday morning, draft a proposal to revise the internal R&D funding guidelines, establishing a 'Pivot Amendment' that officially permits research grant recipients to reallocate up to 20% of their funding toward investigating anomalous, unplanned experimental findings without requiring bureaucratic re-approval.
Serendipity, Design Science Research, Abduction, Epistemic Objects, Iteration, Emergence, Reframing
Design Principles for Trauma-Informed Information Systems for Refugees
DESRIST 2026 · Part II

Design Principles for Trauma-Informed Information Systems for Refugees

Olena Ocheredko, Dominik Siemon and Jamile Teles Hamideh
🎤 Liv Knowles & Leo Kant
This study adopts a trauma-informed design perspective to investigate how integration-related information systems are experienced by refugees. Following an echelons design science research approach, the authors analyzed qualitative data from 35 semi-structured interviews with Ukrainian adults residing across 22 Finnish cities to formulate actionable system design principles.

The problem. Existing public-sector information systems are typically built on rational-user and efficiency assumptions that ignore how trauma and displacement affect cognition, emotion, and trust. As a result, routine digital administrative interactions often induce severe anxiety, cognitive overload, and avoidance behavior, thereby worsening digital and social exclusion.

Key findings.
– Qualitative thematic analysis supported that language barriers, fragmented digital ecosystems, opaque system logic, and repetitive disclosure requests transform routine administrative tasks into emotionally distressing experiences.
– Qualitative evidence supported that trauma-related vulnerability amplifies cognitive overload and system avoidance, whereas informal human mediation (e.g., volunteers and community groups) acts as a critical, necessary workaround.
– The research successfully derived seven concrete design principles for trauma-informed systems, which focus on safety over efficiency, predictability, graduated engagement, integrated human support, contextual legibility, data dignity, and ecosystem coherence.
– Limitations include a highly specific sample (Ukrainian adults under Temporary Protection Directive in Finland) and the fact that these design principles have not yet been instantiated or evaluated in real-world environments.
What it means for you
  • CIO / IT Executive: Instruct your enterprise architecture and quality assurance teams to audit current digital portal KPIs, shifting the primary success metrics for refugee-facing systems away from 'transaction completion speed' and 'strict session timeouts' to 'session resumption rates' and 'error-tolerance thresholds' to prioritize cognitive safety over system throughput.
  • IT Manager: Open your development sprint backlog and create a high-priority user story to eliminate all multi-step form timeouts and implement a 'Save draft and pause' button, ensuring anxious or cognitively overloaded users can step away and return to their application without losing their entered data.
  • Business Strategist: Draft a service-level agreement (SLA) and partnership proposal to fund and integrate local refugee community group volunteers directly into your digital service flow, establishing a 'warm handoff' feature where users can instantly route their digital application to a trusted human mediator for co-filling.
  • Researcher: Draft a research protocol and ethics approval request to conduct a comparative usability study that instantiates and evaluates two of the study's design principles ('contextual legibility' and 'graduated engagement') in a live, low-fidelity prototype with a diverse group of non-Ukrainian displaced persons to test for generalizability.
  • Policymaker: Draft a policy amendment for your department's digital procurement guidelines that requires all future public-sector IT service contracts to prove compliance with 'data dignity' standards, legally prohibiting systems from making repetitive, traumatic identity and disclosure requests across different public agencies.
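The 'Save draft and pause' idea from the IT Manager action above can be sketched as a draft store that never expires entries. This is a minimal illustration under stated assumptions: the class name, file layout, and field names are hypothetical, and `user_id` is assumed to be a safe, opaque identifier.

```python
import json
import tempfile
from pathlib import Path


class DraftStore:
    """Keeps partially filled forms on disk with no expiry timer."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, user_id):
        # Assumes user_id is an opaque, filesystem-safe identifier.
        return self.directory / f"{user_id}.json"

    def save(self, user_id, step, fields):
        """Store the current form step and every field entered so far."""
        payload = {"step": step, "fields": fields}
        self._path(user_id).write_text(json.dumps(payload), encoding="utf-8")

    def resume(self, user_id):
        """Return the saved draft, or None if the user has no draft."""
        path = self._path(user_id)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    store = DraftStore(tmp)
    store.save("user-1", step=2, fields={"city": "Tampere"})
    print(store.resume("user-1"))
```

Because nothing in the store depends on elapsed time, a user can step away for any length of time and return to the same step.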
Information Systems, Design Principles, Trauma-Informed Information Systems, Refugees, eDSR, Digital Divide
Circular Transformation: A Design Science Approach to a Digital Assessment System for the Plastics Industry
DESRIST 2026 · Part II

Circular Transformation: A Design Science Approach to a Digital Assessment System for the Plastics Industry

Stephanos Filippakis, Ulvi Ibrahimli, Jonathan Lambers, Lisa Wolf, Heicke Gaedeke, Ulrich Müller-Steinfahrt and Axel Winkelmann
🎤 Lev Kenning & Lore Koestler
This study develops and evaluates the Circular Transition Assessment Tool (CTAT), a digital self-assessment system designed to help plastics manufacturers evaluate and advance their transition toward a circular economy. Utilizing a design science research approach, the authors formulated meta-requirements and design principles to build a tool that integrates multi-level maturity scoring with automated, expert-validated recommendations. The artifact's utility and usability were iteratively evaluated through qualitative workshops and quantitative surveys with industry professionals.

The problem. Plastics manufacturers face intense regulatory and environmental pressure to transition from linear to circular production models, yet they lack domain-specific tools to accurately assess their readiness. Existing maturity models are often too generic, purely conceptual, or fail to connect with practical organizational decision-making, leading to misallocated resources. Furthermore, there is a lack of research on designing digital assessment tools that balance automated, transparent logic with human expert validation to support credible sustainability transitions.

Key findings.
– Quantitative survey results (N=10) supported that the assessment tool is intuitive, accessible, and highly usable for plastics-industry organizations.
– The evaluation supported the operational feasibility and expected positive decision-making impact of integrating the tool into routine organizational practices.
– The assumption that the tool's standard framework easily captures the highly complex, heterogeneous reality of all real-world circular economy processes was not fully supported: the fidelity items drew a high number of neutral ratings.
– The empirical findings are limited by a small sample size (eight companies total across two cycles) and a geographical focus on the German plastics sector, meaning the tool may require adaptation for other regions or industries.
What it means for you
  • CIO / IT Executive: Launch an audit of your company’s PLM and ERP databases to identify the specific data pipelines required to feed a digital circularity assessment tool, ensuring your system architecture can support a hybrid workflow that combines automated data ingestion with manual expert validation overrides for non-standard manufacturing processes.
  • IT Manager: Draft a technical specification sheet for a prototype circularity self-assessment portal that features a modular UI, allowing plant managers to input heterogeneous plastics recycling metrics while embedding a 'flag for expert review' button to handle complex, non-standard circular processes that automated logic cannot accurately score.
  • Business Strategist: Identify your company's top three most complex, customized plastic production lines and schedule a cross-departmental workshop with plant managers and sustainability leads to define custom, localized circularity KPIs that go beyond standard generic maturity models.
  • Researcher: Draft a research proposal to test and adapt the Circular Transition Assessment Tool (CTAT) framework within a non-German context, such as the North American or Asian plastics sector, specifically designing a methodology to resolve the 'fidelity gap' by co-creating custom assessment modules for highly heterogeneous recycling workflows.
  • Policymaker: Draft a regional regulatory proposal that establishes a standardized digital compliance framework for plastics manufacturers, incorporating incentives and grant funding for SMEs that adopt certified, digital circular-transition self-assessments that feature both automated metrics and verified human-expert validation.
Circular Economy, Design Science Research, Plastics Industry, Maturity Model, Decision Support Systems, Self-Assessment
Generative AI in Knowledge Management: Designing a GenAI Chatbot for Tacit Knowledge Externalization
DESRIST 2026 · Part II

Generative AI in Knowledge Management: Designing a GenAI Chatbot for Tacit Knowledge Externalization

Oliver Dinand, Vincent Heimburg and Manuel Wiesche
🎤 Liv Knowles & Leo Kant
This study adopts an echeloned Design Science Research approach to design, instantiate, and evaluate a generative AI chatbot named externalAIze for externalizing tacit knowledge. The artifact is built on four core design principles (phased reflective questioning, curated input with free-text fallback, adaptive motivational feedback, and human-in-the-loop validation) and was evaluated with 53 participants.

The problem. Organizations face a high risk of losing valuable, experience-based tacit knowledge due to workforce mobility, restructuring, and retirement. Traditional human-mediated elicitation methods are expensive and hard to scale, while existing conversational AI options lack prescriptive design guidance and frequently impose high cognitive loads on users.

Key findings.
– The hybrid interaction modality (curated input options with free-text fallback) was statistically supported as being significantly less demanding in terms of articulation effort compared to a fully open-ended text interface, while providing more expressive flexibility than a purely closed-ended interface.
– Phased reflective questioning based on systematic reflection was supported as a meaningful method to help users surface tacit knowledge, though its perceived effectiveness was lower for participants with no prior familiarity with generative AI systems.
– Empathetic and adaptive motivational feedback positively influenced user engagement and reflection, but was not supported as a standalone solution for older participants or those in later career stages who reported higher hindrances to articulation.
– A human-in-the-loop knowledge validation step was supported as an effective mechanism for users to correct errors and misinterpretations, significantly improving the perceived quality of the final explicit knowledge summary.
– Limitations of the study include a relatively small sample size of 53 participants, reliance on subjective self-report questionnaires, and evaluation using a generic, simplified weekend-trip scenario rather than complex, specialized business processes.
What it means for you
  • CIO / IT Executive: Direct your enterprise architecture team to update the design guidelines for internal GenAI tools, mandating that any knowledge-capture interfaces implement a hybrid interaction model (curated options with free-text fallback) and include a mandatory 'human-in-the-loop' validation step before AI-generated summaries are committed to the corporate knowledge base.
  • IT Manager: Review the interface of your current internal chatbot projects and replace open-ended text boxes with curated, clickable prompt options while retaining a free-text fallback, and schedule a hands-on GenAI onboarding session specifically tailored for late-career team members to reduce their articulation barriers.
  • Business Strategist: Identify the top three departments most vulnerable to knowledge loss due to upcoming retirements, and launch a pilot tacit-knowledge capture initiative that pairs retiring experts with a structured, phased reflective questioning process and a peer-validation step to externalize their critical operational expertise.
  • Researcher: Draft a research protocol to test the externalAIze conversational design framework within a highly specialized, complex business domain, such as medical diagnostics or aerospace engineering, using a sample size of over 100 participants to overcome the limitations of the original study's simplified weekend-trip scenario.
  • Policymaker: Establish an organizational data governance policy that mandates a 'human-in-the-loop' validation protocol, prohibiting the automated publishing of GenAI-summarized employee knowledge into the enterprise directory without explicit, documented manual approval from the originating expert.
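The human-in-the-loop validation step referred to above can be sketched as a publishing gate. This is a hypothetical illustration, not the externalAIze implementation: the class, function, and field names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Summary:
    expert: str                        # person whose knowledge was captured
    text: str                          # AI-generated summary text
    approved_by: Optional[str] = None  # set only after manual review


class ApprovalRequired(Exception):
    """Raised when a summary lacks approval from its originating expert."""


def approve(summary, reviewer, corrected_text=None):
    """Record approval; the expert may correct the text while approving."""
    if reviewer != summary.expert:
        raise ApprovalRequired("only the originating expert may approve")
    if corrected_text is not None:
        summary.text = corrected_text
    summary.approved_by = reviewer
    return summary


def publish(summary, knowledge_base):
    """Append the summary to the knowledge base only if it was approved."""
    if summary.approved_by != summary.expert:
        raise ApprovalRequired("summary has not been validated")
    knowledge_base.append(summary.text)


knowledge_base = []
draft = Summary(expert="dana", text="Pack light for weekend trips.")
approve(draft, "dana", corrected_text="Pack light; check the weather first.")
publish(draft, knowledge_base)
print(knowledge_base)
```

Putting the check inside `publish` rather than in the user interface means no code path can skip the validation step.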
Knowledge Management, Knowledge Externalization, Generative AI, Design Science Research, Chatbots, Tacit Knowledge
Design Science Research in the Age of Generative AI: A Systematic Literature Review and Research Agenda
DESRIST 2026 · Part II

Design Science Research in the Age of Generative AI: A Systematic Literature Review and Research Agenda

Ransome Bawack and Kevin Carillo
🎤 Lev Kenning & Lore Koestler
This study conducts a PRISMA-guided systematic literature review of 35 empirical articles to analyze how Design Science Research (DSR) designs and evaluates generative artificial intelligence (GenAI) artifacts. The authors propose a configuration-centric framework to assess artifact boundaries, evaluation evidence, knowledge portability, and governance in the GenAI era.

The problem. Generative AI challenges traditional DSR assumptions because its probabilistic nature, prompt sensitivity, and reliance on external components make it difficult to treat artifacts as stable, bounded objects. Consequently, there is a lack of clarity on how to establish rigorous evaluation, cumulative design knowledge, and practical validity for GenAI-enabled systems.

Key findings.
– The analysis of the corpus shows that GenAI artifacts are typically socio-technical configurations rather than standalone tools, with pipeline-style architectures present in 25 out of 35 studies and retrieval augmentation in 20 out of 35 studies.
– A recurring weakness identified in the reviewed literature is that evaluation evidence often outpaces configuration disclosure, with many studies failing to report stable model identifiers, prompt libraries, or retriever settings.
– While 23 out of 35 studies attempt to formulate portable design knowledge (such as design principles or blueprints), these contributions are frequently not configuration-complete, risking brittle prescriptions.
– The study finds that although ethics and governance concerns are widely acknowledged in the analyzed papers, they are rarely operationalized or evaluated as core functional design features.
– Acknowledged limitations of this review include the emergent and heterogeneous nature of the analyzed literature, its concentration in specific archetypes (such as RAG systems and assistants), and the potential exclusion of non-English or grey literature.
What it means for you
  • CIO / IT Executive: Create a mandatory registry for all enterprise GenAI projects that requires teams to document and version-control their specific model IDs, prompt templates, and retriever parameters to manage the risks of brittle, unstable configurations.
  • IT Manager: Meet with your development team to implement Git-based versioning for your RAG pipelines' prompts, model identifiers, and retriever settings, ensuring no model updates are pushed without these configurations being locked.
  • Business Strategist: Review the business case for your current GenAI projects and rewrite the project scope to mandate that ethics and governance guardrails are designed and evaluated as core functional requirements, rather than non-functional compliance checkboxes.
  • Researcher: Audit your current design science research paper draft to ensure it is 'configuration-complete' by adding a detailed appendix containing the exact prompt libraries, stable model identifiers, and retriever configurations used.
  • Policymaker: Establish a draft standard for public sector GenAI procurement that requires vendors to submit complete configuration disclosures, including prompt libraries, stable model IDs, and retriever settings, as a prerequisite for security and evaluation audits.
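The configuration registry and versioning suggested above can be sketched as a manifest with a content fingerprint. This is a minimal illustration under stated assumptions: the field names, the example model identifier, and the registry structure are hypothetical, not taken from the reviewed studies.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenAIConfig:
    model_id: str             # stable, versioned model identifier
    prompt_templates: dict    # template name -> prompt text
    retriever_settings: dict  # e.g. chunk size, top-k

    def fingerprint(self):
        """SHA-256 over a canonical JSON form of the whole configuration."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def register(registry, config):
    """Store the config under its fingerprint and return the fingerprint."""
    key = config.fingerprint()
    registry[key] = config
    return key


registry = {}
config = GenAIConfig(
    model_id="example-model-2026-01",  # hypothetical identifier
    prompt_templates={"answer": "Use only the retrieved context: {context}"},
    retriever_settings={"top_k": 5, "chunk_size": 512},
)
key = register(registry, config)
print(key[:12], len(registry))
```

Any change to a prompt, model identifier, or retriever parameter yields a different fingerprint, so an evaluation result can be tied to the exact configuration that produced it.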
Generative artificial intelligence, Design Science, Systematic Literature Review, Socio-technical configurations, Evidentiary stability, Cumulative portability
Fruitfulness in Design Science: Evaluating Contributions by What They Make Possible
DESRIST 2026 · Part II

Fruitfulness in Design Science: Evaluating Contributions by What They Make Possible

Roland M. Mueller
🎤 Liv Knowles & Leo Kant
This study introduces 'fruitfulness' as a complementary evaluative dimension for Design Science Research (DSR). It develops a conceptual model that shifts the evaluation of design knowledge from immediate problem-solving utility to its capacity to expand future design options.

The problem. Traditional DSR evaluations focus primarily on the immediate 'fitness' of an artifact in solving a specific problem. This orientation offers limited guidance for assessing future-oriented or speculative designs whose primary value lies in opening new trajectories rather than immediate optimization.

Key findings.
– This is a purely conceptual study and does not present empirical testing or statistical hypotheses.
– The paper proposes a conceptual model of 'fruitfulness' to assess how design knowledge expands future design trajectories within a possibility lattice.
– The study introduces a taxonomy of four artifact types (configurational, adjacent-exploratory, latent-activating, and boundary-probing) to differentiate how designs support option-making, though these concepts are not empirically validated.
What it means for you
  • CIO / IT Executive: On Monday morning, update your IT project appraisal template to include a 'Fruitfulness' evaluation dimension alongside traditional ROI, allowing your team to justify and fund speculative, 'boundary-probing' pilot projects that open up future architectural options rather than just immediate cost-savings.
  • IT Manager: In your Monday morning team stand-up, introduce the concept of 'latent-activating' designs and challenge your developers to identify one monolithic component in your active sprint that can be refactored into a modular API to expand future system integration options.
  • Business Strategist: On Monday morning, map your product R&D roadmap against the study's four artifact types (configurational, adjacent-exploratory, latent-activating, and boundary-probing) to identify if your pipeline is unsustainably skewed toward immediate 'configurational' tweaks at the expense of long-term option-making.
  • Researcher: On Monday morning, take your current Design Science Research manuscript and restructure your 'Contributions' section to explicitly position your artifact within a 'possibility lattice', detailing how your design expands future research trajectories rather than just solving a static, immediate problem.
  • Policymaker: On Monday morning, draft a new evaluation clause for your technology innovation grant guidelines that rewards applicants who explicitly demonstrate the 'fruitfulness' of their conceptual designs, shifting funding criteria from immediate commercial viability to long-term ecosystem enablement.
Design Science Research, Fruitfulness, Design Evaluation, Adjacent Possible, Anticipatory Design, Future-Oriented Design
Design Principles for Engaging with Desirable Future(s): A Study on Experiencing Entrepreneurial Action to Co-Create Common Future(s) Within Planetary Limits
DESRIST 2026 · Part II

Design Principles for Engaging with Desirable Future(s): A Study on Experiencing Entrepreneurial Action to Co-Create Common Future(s) Within Planetary Limits

Annemarie Bloch
🎤 Lev Kenning & Lore Koestler
This design science study explores how hands-on entrepreneurship education can help individuals feel capable of actively shaping future societal and ecological paths. The researcher developed and evaluated a university course integrating sustainability concepts with entrepreneurial action, analyzing qualitative reflection papers from 20 participating students.

The problem. In an era of complex ecological and social crises, individuals often experience a sense of 'unable-ness' or paralysis, viewing the future as abstract and unchangeable. Conventional educational and organizational frameworks lack actionable design knowledge to help people overcome this perceived lack of agency and collectively build pathways toward desirable futures.

Key findings.
– Qualitative analysis of student reflection papers supported that integrating entrepreneurial action with collaborative future-making practices successfully cultivated a sense of personal 'able-ness' to shape desirable futures, with two individual exceptions where students remained unoptimistic or felt a lack of belonging.
– The qualitative data supported that acquiring knowledge about severe ecological crises does not lead to paralysis when paired with solution-oriented, collaborative entrepreneurial tasks.
– The study qualitatively validated eight design principles (DP1-DP8) emphasizing multi-disciplinary teamwork, value negotiation, systemic thinking, and safe spaces for prototyping as effective frameworks for cultivating future makers.
– Limitations of the findings include a small qualitative sample of 20 higher education students, potential desirability bias in self-reported reflections, and specific dependency on the educational context.
What it means for you
  • CIO / IT Executive: Send an email to invite your Head of Infrastructure and your Head of Sustainability to co-sponsor a 48-hour 'Green IT Sandbox' initiative next month, allocating a small budget and a safe, isolated cloud environment for cross-functional developer teams to prototype cloud-compute carbon reduction solutions.
  • IT Manager: In your Monday morning team sync, introduce a 15-minute 'Value-Negotiation and Prototyping' agenda item where team members can pitch one sustainable, low-code alternative to a resource-heavy legacy system, and dedicate a 4-hour window this Friday for the team to build a rough proof-of-concept without fear of project timeline penalties.
  • Business Strategist: Draft an agenda for next month's quarterly strategy session that replaces passive environmental risk-forecasting with a hands-on 'Planetary Boundaries Venture' workshop, requiring participants to pitch three new circular business models that address a specific ecological crisis while maintaining core business viability.
  • Researcher: Email three corporate innovation heads to pitch a collaborative, mixed-methods study that applies the eight design principles (DP1-DP8) to their internal R&D teams, incorporating objective behavioral metrics (like prototype completion rates) alongside self-reported surveys to mitigate desirability bias.
  • Policymaker: Submit a policy memo to the department head proposing a pilot grant program that conditions local university funding on the integration of collaborative, solution-oriented 'Eco-Entrepreneurship Challenges' into standard climate science curricula, shifting from lecture-only education to active future-making.
Entrepreneurial action, desirable futures, planetary boundaries, design science research, sustainability education, future making
Testbed Research, Practitioners and Researchers as Architects of Digital Ecosystems
DESRIST 2026 · Part II

Simon Hiller, Patrick Weber, Maximilian Werling and Heiner Lasi
🎤 Liv Knowles & Leo Kant
This study introduces Testbed Research, a design-oriented methodology adapted from Action Design Research (ADR) to guide the design and management of digital ecosystems. The authors develop a five-stage, process-oriented framework and evaluate its application across 30 real-world ecosystem initiatives conducted between 2016 and 2026.

The problem. Non-digital enterprises frequently struggle to establish and sustain digital ecosystems due to the high complexity of managing heterogeneous partners with conflicting interests. Existing design science research methods lack structured guidelines for coordinating multiple autonomous actors and do not account for the role of researchers as neutral orchestrators or trust anchors.

Key findings.
– This study does not perform statistical hypothesis testing, instead evaluating the framework qualitatively through practical application.
– The five-stage Testbed Research methodology facilitated the creation of digital ecosystem proofs of concept within 12 to 18 months across 30 real-world projects.
– The research team established a neutral 'trust anchor' role, which resolved controversial partner negotiations over risk and reward allocation during ecosystem formation.
– A detailed pay-per-part case study demonstrated that the method aligned a diverse group of partners, including an original equipment manufacturer, a bank, a service provider, and a mechanical engineering company, to realize collaborative value.
– Limitations include that the framework's performance was evaluated within a specific set of 30 initiatives, and the influence of different funding models on stakeholder dynamics remains untested.
What it means for you
  • CIO / IT Executive: Review your active digital partnership pipeline, select one complex multi-party ecosystem initiative, and draft an invitation to a local university or independent research institute to join the steering committee as a neutral 'trust anchor' to mediate partner conflicts and govern data sharing.
  • IT Manager: Take your current multi-partner integration roadmap and restructure the next phases into a 12-to-18-month timeline specifically focused on building a minimal collaborative proof-of-concept (PoC), establishing clear technical stage-gates for partner contributions.
  • Business Strategist: Draft a 'pay-per-use' or 'pay-per-part' business model blueprint for your next multi-enterprise venture, explicitly detailing how costs, revenues, and risks will be split between your firm, a financial partner, and a service provider.
  • Researcher: Email the coordinator of an industry consortium and propose a joint 'Testbed Research' project where your academic team acts as a neutral orchestrator to design, document, and evaluate their ecosystem's governance model over the next year.
  • Policymaker: Draft a new requirement for your department's upcoming digital innovation grant cycle under which every funded industry-led consortium must include a neutral research institution as its ecosystem orchestrator.
Action Design Research, Design Science Research, Digital Ecosystems, Testbed Research, Design-oriented IS
Fundamental Patterns, A Taxonomy for Archetype Development in Information Systems
DESRIST 2026 · Part II

Christian Koldewey, Celina Maleen Avermeyer, Hendrik van der Valk, Julian Zerbin and Roman Dumitrescu
🎤 Liv Knowles & Leo Kant
This study analyzes 114 archetype development approaches in the Information Systems domain through a systematic literature review to construct a comprehensive taxonomy. The taxonomy classifies these approaches based on their context, objects of analysis, and development procedures to establish a clearer methodological foundation.

The problem. Although archetypes are widely used to categorize and analyze complex socio-technical phenomena, their construction often lacks methodological rigor and standardization. This fragmentation leads to ad-hoc research configurations, limits transparency, and makes it difficult to compare findings across different studies.

Key findings.
– The study successfully developed and validated a taxonomy comprising 43 characteristics across 11 dimensions to structure archetype development (supported by empirical analysis of 114 papers).
– Analysis showed that 97% of archetype developments are conducted ex-post and 98% are descriptive rather than prescriptive (statistically supported by frequency distribution).
– The most common purpose of archetypes is to serve as a design guide (65%), followed by conceptual maps (42%), assessment instruments (37%), and theory builders (30%) (statistically supported; non-exclusive categories).
– A significant methodological deficit was identified, with 80% of reviewed archetype studies containing no evaluation of their developed archetypes (statistically supported).
– The study's limitations include potential interpretive bias during taxonomy development and restriction of the analyzed corpus exclusively to the Information Systems domain.
What it means for you
  • CIO / IT Executive: On Monday morning, audit your enterprise architecture roadmap and issue a directive requiring that all active IT user personas and system archetypes used for technology procurement undergo a formal, documented validation workshop with end-users before any design decisions are finalized, correcting the common 'no evaluation' pitfall.
  • IT Manager: On Monday morning, gather your development team and cross-reference your project's target user personas (archetypes) against actual system usage logs to verify that these profiles represent real-world user behaviors rather than unvalidated, ad-hoc assumptions.
  • Business Strategist: On Monday morning, review your organization's business model or customer archetypes and explicitly write out 3-5 action-oriented, prescriptive rules for each profile to shift them from being purely descriptive concepts (as 98% of the reviewed archetype developments are) into actionable guides for strategic growth.
  • Researcher: On Monday morning, open the methodology section of your active archetype development paper and add a dedicated, rigorous validation step, such as expert Delphi panels or statistical cluster analysis, so that your study does not join the 80% of reviewed archetype studies that contain no evaluation of their archetypes.
  • Policymaker: On Monday morning, draft an update for your department's digital capability framework requiring that the development methodology and evaluation criteria behind any IT classification schema or organizational maturity archetype used to evaluate public sector grants be published, to prevent ad-hoc, non-standardized policy implementation.
Archetype Development, Design Knowledge, Taxonomy Development, Conceptual Artifact, Theory Building
Authentic Learning by Design: Meta-Requirements for AI Support for Students and Educators
DESRIST 2026 · Part II

Anna Wolters, Gregor Kipping, Sofie Wass, Michael Gau, Dennis M. Riehle and Leona Chandra Kruse
🎤 Lev Kenning & Lore Koestler
This study examines how artificial intelligence (AI) systems can be designed and integrated to facilitate authentic learning in higher education. Utilizing an echeloned design science research (eDSR) approach, the authors analyzed 200 AI systems deployed across European institutions and interviewed 11 experienced educators to construct a design problem space. From these insights, they derived and qualitatively validated ten meta-requirements for AI learning support systems.

The problem. The rapid adoption of large language models in education has sparked concerns that students may outsource critical cognitive tasks, leading to deskilling and a lack of authentic learning. Additionally, educators often lack the competencies, time, and structured institutional support needed to effectively integrate these tools into their curricula. This creates a significant gap in prescriptive design guidance for educational AI systems that promote deep pedagogical engagement rather than simple cognitive shortcuts.

Key findings.
– A market review of 200 European educational AI systems revealed that current tools predominantly focus on administration, content-specific learning, or task feedback, while completely neglecting substantive student-to-student conversations.
– The study identified three central qualitative problem dimensions: educators' lack of competencies and resources, students' uncritical reliance on AI outputs, and insufficient institutional adoption support.
– Ten meta-requirements (R1-R10) addressing content, user experience, and governance were developed to guide the design of AI-supported learning platforms.
– Qualitative validation with experts supported the relevance and coherence of the meta-requirements, although educators noted that institutional regulations could restrict the flexibility of proposed assessment templates (R4).
– The study's findings are qualitative and subject to limitations including a geographic focus on European institutions and a small evaluation sample size of 11 initial interviewees, 12 focus group participants, and 3 validation interviewees.
What it means for you
  • CIO / IT Executive: Review the university's active AI software contracts and update the IT procurement guidelines to require that all educational AI tools meet the study's ten meta-requirements, prioritizing systems that foster student-to-student collaboration over simple question-and-answer automation.
  • IT Manager: Create and deploy a 'Socratic Tutor' system prompt template in the institutional AI sandbox, and invite five faculty members to test it in their classes to prevent students from outsourcing their writing assignments.
  • Business Strategist: Draft a product development roadmap for an AI-powered social learning tool that fills the market gap identified in the research by facilitating peer-to-peer study group discussions instead of solo student-to-AI interactions.
  • Researcher: Write a grant proposal to conduct a large-scale, quantitative study validating the ten meta-requirements across 1,000 students and educators in non-European universities to address the original study's geographic and sample size limitations.
  • Policymaker: Draft a policy exception framework for the academic senate that allows departments to bypass rigid assessment regulations, enabling educators to pilot flexible, process-oriented assessment templates (R4) that integrate AI.
Authentic Learning, Design Science Research, Meta-Requirements, AI Learning Support, Pedagogical Design, Higher Education
From Passive Consumption to Creative Engagement: Designing and Evaluating Scaffolded Augmented Reality Authoring
DESRIST 2026 · Part II

Dana Hofmann, Kay Hönemann and Manuel Wiesche
🎤 Lev Kenning & Lore Koestler
This study utilizes an Action Design Research approach to iteratively design and evaluate a voxel-based augmented reality (AR) authoring tool. It compares a semi-structured, example-based scaffold against an unstructured baseline to examine how structural guidance affects the creativity and cognitive load of novice users.

The problem. Augmented reality technologies are largely limited to passive content consumption because existing authoring systems require complex programming and 3D modeling skills. Providing unstructured environments to lower these barriers can overwhelm novice users, yet adding guidance risks restricting creative autonomy and causing anchoring bias.

Key findings.
– Statistically Supported: Artifacts created with semi-structured example-based scaffolds were rated by external experts as significantly more creative than those created without scaffolds.
– Not Statistically Supported: There was no statistically significant difference in participants' subjective perceptions of creativity support between the scaffolded and baseline conditions.
– Not Statistically Supported: Differences in total cognitive load, germane load, and extraneous load between the scaffolded and baseline groups were not statistically significant.
– Study Limitations: The findings are limited by a small sample of 36 novice users and a specific voxel-based design paradigm, meaning the results may not generalize to experienced users or alternative AR systems.
What it means for you
  • CIO / IT Executive: Instruct your enterprise architecture and procurement teams to update the evaluation criteria for any new low-code/no-code AR or 3D development platforms, making the presence of modular, example-based 'starter' templates a mandatory requirement for purchase approval.
  • IT Manager: Before launching next week's AR content creation sprint, set up a shared repository containing three distinct, semi-structured voxel or 3D templates for your design team, and direct them to use these scaffolds as starting points rather than starting from a blank canvas.
  • Business Strategist: Review the product development roadmap for your customer-facing AR application and write a product requirement document (PRD) to replace the open-ended 'sandbox' creative mode with a guided, example-driven creation wizard to drive higher-quality user-generated content.
  • Researcher: Draft a new research proposal and ethics protocol to investigate the gap between objective expert-rated creativity and subjective user-perceived creativity in scaffolded environments, planning a mixed-methods study with a target sample of at least 100 expert 3D designers.
  • Policymaker: Revise the funding criteria for the upcoming municipal 'Digital Skills and Emerging Tech' grant program to prioritize educational institutions whose AR/VR training curricula utilize scaffolded, example-based pedagogical tools over unstructured 'free-play' software platforms.
Augmented Reality, Creativity Support, Action Design Research, Example-based Scaffolds, Design Science Research
Designing AI-Based Entrepreneurial Coaching Systems
DESRIST 2026 · Part II

Jonas Liebschner, Daniel Heinz and Gerhard Satzger
🎤 Lev Kenning & Lore Koestler
This study investigates the design requirements for AI-driven entrepreneurial coaching systems using a design science research approach. The researchers conducted semi-structured interviews with startup founders and professional coaches in Germany to understand the limitations of current coaching models and generic AI assistants.

The problem. Human startup coaching is heavily capacity-constrained and episodic, which often results in coaches lacking visibility into founders' day-to-day decisions. Meanwhile, generic LLM-based assistants lack persistent venture context and provide overly agreeable feedback, which risks substituting founder judgment instead of facilitating deep reflection and learning.

Key findings.
– Proposed and empirically supported four core design requirements for AI coaching systems: task-oriented venture progression support, structured venture building, assumption-driven validation, and reflective developmental support.
– Identified the 'scaffolding-substitution' boundary as a critical design tension, showing that AI systems should provide 'developmental friction' via critical questioning rather than immediate, unchecked answers.
– Outlined a hybrid service design model for support organizations where AI manages structured preparation (e.g., business model canvas, experiment planning), leaving scarce human coaching capacity for high-value reasoning under ambiguity.
– Acknowledged study limitations, noting the qualitative sample was situated specifically within German university-embedded incubator environments focusing on early-stage technology ventures.
What it means for you
  • CIO / IT Executive: Review your organization's internal AI application portfolio and mandate that any generative AI tools used for employee training, professional development, or innovation coaching be configured with system prompts that enforce 'developmental friction', prohibiting the AI from generating direct answers and instead requiring it to ask critical, challenging questions.
  • IT Manager: On Monday morning, configure a custom GPT or system prompt template for your internal product development teams that is strictly bounded to 'structured preparation': program it to guide users through completing a Business Model Canvas and planning validation experiments, while explicitly blocking it from making strategic execution decisions for the user.
  • Business Strategist: Redesign the workflow of your corporate accelerator or internal venture-building program to implement a hybrid model: mandate that teams use AI tools to complete the initial structured data prep and hypothesis generation, reducing human-led coaching sessions from weekly 60-minute status updates to bi-weekly 30-minute deep-dives focused strictly on resolving highly ambiguous strategic roadblocks.
  • Researcher: Draft a research proposal and experimental design to test the 'scaffolding-substitution' boundary outside the German university ecosystem, specifically targeting late-stage corporate spin-offs in the US or Asia to measure how 'developmental friction' affects venture survival rates compared to traditional generic LLM support.
  • Policymaker: Draft a new criteria section for regional public startup grant applications that funds local incubators to integrate shared, hybrid AI-human coaching platforms, requiring them to demonstrate how they use AI-driven 'validation templates' to scale their advisor capacity to reach 50% more underserved early-stage founders.
Entrepreneurial coaching, virtual coaching systems, large language models, design requirements, design science research
Integration is Key: Designing AI-Based Teaching Assistants for ERP Education
DESRIST 2026 · Part III

Christopher Gillespie and Hannah Sperling
🎤 Liv Knowles & Leo Kant
This research-in-progress study designs and evaluates an AI-based conversational teaching assistant integrated directly with SAP S/4HANA to support ERP education. Following a design science research approach, the authors developed a hybrid conversational agent utilizing discriminative AI and conducted a preliminary evaluation using pilot user tests and qualitative expert feedback.

The problem. ERP systems are highly complex, making hands-on classroom training resource-intensive as instructors spend substantial time troubleshooting individual student errors. Standalone generative AI tools lack integration with real-time student transaction data and are prone to hallucinations, limiting their ability to provide accurate, context-specific feedback.

Key findings.
– Qualitative feedback from academic and professional instructors supported the proposition that the ERP teaching assistant saves significant troubleshooting time and mental resources.
– Preliminary user testing supported the finding that students with higher digital literacy navigated the conversational assistant more effectively, whereas students with less technology experience struggled to resolve dialogue misunderstandings.
– Structural testing supported the utility of using a discriminative AI model to successfully prevent hallucinations, though users noted conversational rigidity and gaps in theoretical curriculum content as current design limitations.
What it means for you
  • CIO / IT Executive: On Monday morning, halt plans to deploy standalone, generic GenAI chatbots for internal ERP support and instead charter an IT project to build an API-level integration between your live SAP S/4HANA database and a discriminative AI assistant to ensure hallucination-free, transaction-aware troubleshooting.
  • IT Manager: On Monday morning, create and distribute a one-page 'AI Interaction Cheat Sheet' specifically for non-technical ERP users, detailing how to format queries and recover from dialogue misunderstandings when using the rigid discriminative AI assistant.
  • Business Strategist: On Monday morning, draft a resource-reallocation proposal to shift 20% of your ERP user-support budget from human-led troubleshooting to integrated conversational AI development, using the study's findings on reduced expert mental load and troubleshooting time to justify the ROI.
  • Researcher: On Monday morning, write a research proposal for a laboratory experiment testing a dual-layer AI architecture that combines a generative model (to handle theoretical/conceptual questions) with a discriminative model (to handle database-specific error-checking) to eliminate conversational rigidity.
  • Policymaker: On Monday morning, update the institutional ERP training curriculum standards to mandate a baseline 'Conversational AI Literacy' onboarding module, ensuring students with lower technology experience are trained in AI dialogue navigation before beginning hands-on system training.
SAP S/4HANA, ERP education, conversational agent, design science research, discriminative AI, teaching assistant
A DSR-Informed Intervention to Strengthen Self-Determination in Blended Professional Learning
DESRIST 2026 · Part III

Pauline Weritz
🎤 Lev Kenning & Lore Koestler
This study investigates how blended learning environments can be designed using a Design Science Research (DSR) approach to support the self-determination of professional adult learners. By combining Self-Determination Theory with literature analysis and learner needs, the study establishes a conceptual design framework and five initial design principles.

The problem. Adult professional learners often struggle with motivation in blended learning environments due to a lack of clear course structure, insufficient conceptual scaffolding, and weak alignment with real-world practice. Existing systems rarely design intentionally for psychological needs like autonomy, competence, and relatedness, resulting in poor learner engagement.

Key findings.
– Five initial design principles were conceptualized to address learning structure, conceptual clarity, active engagement, contextual relevance, and learning flow.
– The proposed design principles and the prototype have not yet been empirically or statistically tested, meaning the theorized motivational benefits remain hypothetical.
– Preliminary qualitative validation was limited to early feedback from only two students, who supported the feasibility of the design direction.
– Diagnostic survey responses from the targeted student group indicated that perceived autonomy was rated the lowest motivational factor, followed by relatedness, competence, and satisfaction, though statistical significance tests for these differences were not reported.
– A key limitation is the early stage of the research, which lacks full implementation, expert evaluation, and large-scale empirical testing.
What it means for you
  • CIO / IT Executive: On Monday morning, initiate a review of the enterprise LMS roadmap to transition away from static content repositories and prioritize platforms that support customizable, self-paced learning pathways to address the critical lack of user autonomy.
  • IT Manager: On Monday morning, configure the active blended learning modules in your LMS to include visual progress maps and modular checklists to provide the structural scaffolding and learning flow adult learners need.
  • Business Strategist: On Monday morning, redesign corporate training completion criteria to require a manager-approved, real-world work project, directly aligning training content with daily operations to solve the issue of weak contextual relevance.
  • Researcher: On Monday morning, draft an empirical study protocol to transition this hypothetical framework into a controlled, large-scale A/B test comparing the motivational outcomes of these five design principles against traditional blended learning methods.
  • Policymaker: On Monday morning, draft a revised professional development standard that mandates all federally or institutionally funded blended learning programs incorporate explicit design elements supporting learner autonomy and relatedness.
Blended Professional Learning, Autonomy, Competence, Relatedness, Design Science Research
The LLM as Adversary: Designing “Dirty Audit” Assessments to Elicit Evaluative Judgement in Education
DESRIST 2026 · Part III

Shahper Richter and Patrick Dodd
🎤 Lev Kenning & Lore Koestler
This study details the design and pilot implementation of an LLM-adversarial assessment pattern where students are tasked with critiquing a flawed, AI-generated 'Dirty Audit' to test their evaluative judgment. The research employs a Design Science Research framework to develop a scalable and AI-resilient assessment model in a digital marketing course. To prevent academic misconduct and AI-assisted cheating, the methodology incorporates a written critique scaffold followed by a proposed synchronous oral verification layer.

The problem. The rapid advancement of generative AI allows students to easily produce polished and professional marketing audits, which invalidates traditional artifact-creation assignments. Consequently, educators face a validity crisis because grades no longer reliably measure students' analytical competence or independent understanding. There is a critical research gap in designing scalable assessments that leverage AI as an adversarial tool rather than merely attempting to restrict its use.

Key findings.
– In the first design cycle involving 151 students, the written critique format resulted in grade clustering in the A- to B+ range, demonstrating a ceiling effect that lacked the power to distinguish between independent work and AI-assisted submissions.
– Automated testing using an LLM red-team confirmed that generative AI alone could not reliably detect the subtle, context-specific errors embedded in the 'Dirty Audit' or construct traceable strategic recommendations, supporting the task's adversarial resistance.
– The study identified a limitation in asynchronous written assessments, showing they remain susceptible to AI-assisted completion and are insufficient for validation without synchronous defense.
– To address the ceiling effect and verify authorship, a seven-minute Interactive Oral Assessment (IOA) verification layer was designed for the second cycle, though its empirical efficacy remains to be fully evaluated in future research.
What it means for you
  • CIO / IT Executive: Instruct your LMS integration team to audit the institution's current educational technology stack to identify, license, and integrate tools that support secure, scalable scheduling and recording for 7-minute Interactive Oral Assessments (IOAs) to prepare for the phase-out of written-only grading.
  • IT Manager: Set up an automated 'red-team' testing pipeline using your enterprise LLM API to run a baseline test on your department's top five written assessments, determining which prompts easily yield passing grades and need immediate transition to a 'Dirty Audit' model.
  • Business Strategist: Redesign the corporate training and onboarding curriculum by replacing 'content creation' milestones with 'Dirty Audit' exercises, requiring new hires to critique and correct flawed, AI-generated department reports to test their actual evaluative judgment.
  • Researcher: Draft a controlled empirical research protocol for the upcoming semester to compare grade distributions and AI-detection rates between a student cohort using written-only critiques and a cohort subjected to the 7-minute Interactive Oral Assessment (IOA) verification layer.
  • Policymaker: Draft an update to the institutional Academic Integrity and Assessment Policy that explicitly mandates a synchronous, oral, or interactive verification layer for any high-stakes written assessment that is worth more than 30% of a course's final grade.
Design Science Research, Beyond the Artificial, Assessment Design, Evaluative Judgement, Large Language Models, Interactive Oral Assessment
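The 'red-team' baseline test suggested for IT managers in the 'Dirty Audit' entry can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: `red_team`, `fake_llm`, `keyword_rubric`, the assessment ids, and the 0.5 pass mark are all hypothetical names and values, and the stand-in functions would be replaced by a real LLM API call and a real grading rubric.

```python
from typing import Callable, Dict, List


def red_team(
    prompts: Dict[str, str],
    generate: Callable[[str], str],
    grade: Callable[[str, str], float],
    pass_mark: float = 0.5,
) -> List[str]:
    """Return ids of assessments where an unaided LLM answer reaches the pass mark."""
    vulnerable = []
    for assessment_id, prompt in prompts.items():
        answer = generate(prompt)             # LLM attempt with no human input
        score = grade(assessment_id, answer)  # rubric score between 0 and 1
        if score >= pass_mark:
            vulnerable.append(assessment_id)
    return vulnerable


# Stand-ins so the sketch runs offline (assumptions, not a real API or rubric).
def fake_llm(prompt: str) -> str:
    return "generic answer to: " + prompt


def keyword_rubric(assessment_id: str, answer: str) -> float:
    # Hypothetical rubric: fraction of required phrases found in the answer.
    required = {
        "essay": ["generic"],
        "dirty_audit": ["mislabelled KPI in section 3"],
    }
    hits = [phrase in answer for phrase in required[assessment_id]]
    return sum(hits) / len(hits)


flagged = red_team(
    {"essay": "Write a marketing audit.", "dirty_audit": "Critique this audit."},
    fake_llm,
    keyword_rubric,
)
print(flagged)  # → ['essay']
```

In this toy run only the generic essay task is flagged, because the stand-in model cannot name the context-specific error the 'Dirty Audit' rubric looks for.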
Improving Human Foraging with Hybrid Semantic Graph Retrieval and LLM-Supported Meaning Making
DESRIST 2026 · Part III

Alexander Meier
🎤 Liv Knowles & Leo Kant
This study evaluates how a hybrid semantic graph-vector retrieval system impacts cognitive cost, user satisfaction, and task performance in retrieving presentation slides. The researcher conducted an online experiment with 56 participants who constructed a slide-based proposal pitch under different retrieval conditions.

The problem. Creating presentations often requires reusing legacy slides from large repositories, which is cognitively demanding and frequently lacks narrative consistency. Traditional search engines and hierarchical folders fail to support users in navigating complex, multimodal content and structuring cohesive stories.

Key findings.
– Hybrid graph-vector-based exploration led to statistically significantly higher user satisfaction compared to traditional hierarchical folder navigation (Supported).
– Hybrid graph-vector-based exploration statistically significantly improved task performance in slide retrieval and creation (Supported).
– Hybrid graph-vector-based exploration did not lead to a statistically significant decrease in cognitive effort compared to traditional navigation (Not Supported).
– Hypotheses regarding the quality of LLM-supported narratives and their mediating effects were not tested in this preliminary, work-in-progress paper, which is also limited by a small sample size (N=56).
What it means for you
  • CIO / IT Executive: Authorize a pilot budget to migrate the company's central sales and pitch deck repositories away from legacy folder-based shared drives (like SharePoint or Google Drive) and into a hybrid graph-vector database system to improve asset reuse and team productivity.
  • IT Manager: Extract a sample of 500 legacy presentation slides from the marketing department, index them using a hybrid graph-vector database (such as Neo4j with vector search capabilities), and configure a basic search interface for a pilot user group to test.
  • Business Strategist: Restructure the proposal-writing workflow to mandate a 'semantic search' phase at the start of pitch creation, training team members to query the repository for thematic slides rather than recreating presentations from scratch.
  • Researcher: Draft a formal research proposal and IRB application to conduct a follow-up study with a larger sample size (N > 150) that explicitly tests the quality, coherence, and cognitive load of LLM-supported narrative transitions in slide deck creation.
  • Policymaker: Draft and distribute an enterprise-wide metadata policy that mandates tagging all corporate presentation slide decks with key relationship and semantic metadata, phasing out the reliance on deep hierarchical folder structures.
Foraging, Hybrid Search, Graph RAG, LLM, Narratives, Information Foraging Theory
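To make the idea of hybrid graph-vector retrieval in the slide-foraging entry concrete, here is a minimal sketch under stated assumptions. It is not the system evaluated in the study: the slide ids, embedding vectors, graph edges, and the `graph_bonus` weighting are all invented for illustration. The sketch ranks slides by cosine similarity and then boosts the graph neighbours of the best hit.

```python
import math

# Hypothetical toy data: slide id -> embedding vector (made-up numbers).
EMBEDDINGS = {
    "pricing": [0.0, 1.0, 0.0],
    "discount": [0.1, 0.9, 0.0],
    "case_study": [1.0, 0.0, 0.0],
    "team": [0.0, 0.2, 1.0],
}

# Hypothetical semantic graph: slide id -> narratively related slide ids.
GRAPH = {"pricing": ["case_study"]}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, top_k=1, graph_bonus=0.5):
    """Rank slides by vector similarity, then boost neighbours of the top hits."""
    scores = {sid: cosine(query_vec, vec) for sid, vec in EMBEDDINGS.items()}
    seeds = sorted(scores, key=scores.get, reverse=True)[:top_k]
    for seed in seeds:
        for neighbour in GRAPH.get(seed, []):
            scores[neighbour] += graph_bonus * scores[seed]
    return sorted(scores, key=scores.get, reverse=True)


query = [0.0, 1.0, 0.0]  # pretend embedding of the query "pricing"
print(hybrid_search(query, graph_bonus=0.0))  # → ['pricing', 'discount', 'team', 'case_study']
print(hybrid_search(query))                   # → ['pricing', 'discount', 'case_study', 'team']
```

With the graph bonus switched off, the case-study slide ranks last because its embedding is unrelated to the query. With the bonus on, its link to the pricing slide lifts it above the team slide, which is the kind of narrative connection vector similarity alone would miss.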
Balancing Openness and Security in Digital Energy Ecosystems
DESRIST 2026 · Part III

Theodore Kindong, Gianluigi Viscusi and Björn Johansson
🎤 Lev Kenning & Lore Koestler
This study develops a design science research framework to balance open innovation and information security within digital energy ecosystems. The researchers conceptualize openness and security as interdependent dimensions and apply their proposed Digital Energy Ecosystem (DEE) framework to the case of the Swedish National Data Hub.

The problem. The digitalization of smart grids creates a critical tension between the need for open, collaborative architectures that drive innovation and the necessity of robust cybersecurity to protect critical energy infrastructure. Existing management approaches often treat openness and security as opposing forces rather than co-evolving, interdependent design requirements.

Key findings.
– The study introduces the Digital Energy Ecosystem (DEE) framework, which models system performance as a function of innovation value, security risk, and a governance loop factor; formal statistical hypothesis testing was not conducted in this design science paper.
– Application of the framework to the Swedish Data Hub vignette demonstrated that regulatory delays reduced governance effectiveness (from a target of 1.2 to 0.7) and increased residual risk (from 1.6 to 2.2), explaining why the project was placed on hold.
– The conceptual and qualitative analysis supports the design proposition that systemic viability in digital energy infrastructures depends as much on institutional readiness and regulatory clarity as it does on technical cybersecurity solutions.
What it means for you
  • CIO / IT Executive: Review your organization's active smart grid and data-sharing integration projects to map them on a dual 'Innovation Value vs. Residual Security Risk' matrix, identifying any projects where regulatory or institutional bottlenecks are currently inflating security risks.
  • IT Manager: Conduct an audit of your energy data-sharing APIs and design a prototype for adaptive, multi-tiered access controls that dynamically adjust data-sharing openness based on the connecting partner's real-time security compliance, rather than using a binary allow/block system.
  • Business Strategist: Add a 'regulatory and institutional readiness' risk multiplier to the financial models of your digital energy initiatives to account for potential project delays and governance friction, using the Swedish Data Hub case as a baseline for how delays inflate residual risk.
  • Researcher: Draft a research proposal to empirically test the Digital Energy Ecosystem (DEE) framework's mathematical governance loop factor by gathering quantitative operational data from active European smart grid projects that have successfully balanced open APIs with utility cybersecurity.
  • Policymaker: Draft a policy proposal to launch a dedicated 'Energy Data Sandbox' that provides energy startups with a pre-approved, secure test environment with a guaranteed 14-day regulatory response SLA to eliminate the governance bottlenecks that stall national infrastructure projects.
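The IT Manager item above contrasts tiered access with a binary allow/block rule. A minimal sketch of that idea follows, assuming a hypothetical partner compliance score between 0 and 1; the tier names, thresholds, and data scopes are illustrative and do not come from the paper.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered from most to least open. Thresholds and scope
# names are illustrative assumptions, not values from the paper.
TIERS = [
    (0.9, "full", {"aggregated", "metering", "near_real_time"}),
    (0.7, "standard", {"aggregated", "metering"}),
    (0.4, "restricted", {"aggregated"}),
]


@dataclass(frozen=True)
class AccessDecision:
    tier: str
    scopes: frozenset


def decide_access(compliance_score: float) -> AccessDecision:
    """Map a partner's current compliance score (0..1) to an access tier."""
    if not 0.0 <= compliance_score <= 1.0:
        raise ValueError("compliance_score must be between 0 and 1")
    for threshold, tier, scopes in TIERS:
        if compliance_score >= threshold:
            return AccessDecision(tier, frozenset(scopes))
    # Below every threshold: no data scopes, but still an explicit tier.
    return AccessDecision("blocked", frozenset())
```

Re-evaluating `decide_access` whenever a partner's score changes is what makes the openness adaptive: a drop in compliance narrows the scopes instead of cutting the partner off outright.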
Digital energy ecosystems, Energy Management Systems, Openness, Information Security, Design Science Research, Adaptive Governance
Reuse in Design Science Research: The Example of a Distributed Ledger Technology Design Knowledge Library
DESRIST 2026 · Part III

Max Gräser and Rainer Alt
🎤 Lev Kenning & Lore Koestler
This research-in-progress paper proposes the conceptualization and development of a design knowledge library for distributed ledger technology (DLT) to enable the systematic reuse of design knowledge in future projects. The research design outlines a multi-phase approach involving a systematic literature review, expert interviews, and the implementation of two e-commerce prototypes.

The problem. Design science research (DSR) knowledge is rarely systematically accumulated or reused across projects, leading to inefficient and fragmented development processes. In the context of DLT, practitioners lack concrete, structured guidelines to bridge the gap between high-level technology adoption decisions and actual system implementation features.

Key findings.
– Identified initial design elements consisting of five requirement categories, nine design principles, and twelve implementation features from a preliminary review of seven DSR publications (Supported qualitatively by the initial literature review).
– Mapped complex relationships (n:m) between design requirements, such as preventing data manipulation, and corresponding principles and features (Supported qualitatively by the initial literature review).
– Outlined a structured four-phase development and evaluation plan for the library, including future validation through two e-commerce prototypes (Proposed framework, empirical validation remains for future work).
– Recognized limitations, as this is a research-in-progress study based on a small preliminary literature sample without full empirical validation or statistical testing.
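The n:m relationships in the findings above can be pictured as two many-to-many lookups. The sketch below is a hedged illustration: only "prevent data manipulation" is named in this summary, and every principle and feature label is a hypothetical placeholder rather than an entry from the paper's library.

```python
# Hypothetical entries for illustration only. The requirement
# "prevent data manipulation" appears in the summary; all other labels
# are invented placeholders.
REQ_TO_PRINCIPLES = {
    "prevent data manipulation": {"immutable record keeping", "verifiable provenance"},
    "limit data exposure": {"verifiable provenance"},
}
PRINCIPLE_TO_FEATURES = {
    "immutable record keeping": {"append-only ledger", "hash-linked entries"},
    "verifiable provenance": {"hash-linked entries", "signed transactions"},
}


def features_for(requirement: str) -> set:
    """Collect every feature reachable from a requirement via its principles."""
    features = set()
    for principle in REQ_TO_PRINCIPLES.get(requirement, set()):
        features |= PRINCIPLE_TO_FEATURES.get(principle, set())
    return features


def requirements_for(feature: str) -> set:
    """Reverse lookup: the requirements a feature traces back to."""
    return {
        req
        for req, principles in REQ_TO_PRINCIPLES.items()
        if any(feature in PRINCIPLE_TO_FEATURES.get(p, set()) for p in principles)
    }
```

Because one feature can serve several requirements and one requirement can need several features, both directions of lookup are useful: forward for planning an implementation, backward for justifying a feature.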
What it means for you
  • CIO / IT Executive: Direct your enterprise architecture team to audit current DLT and blockchain initiatives and establish an internal 'design knowledge repository' that maps high-level business requirements, such as data manipulation prevention, directly to reusable technical design principles, preventing development teams from reinventing the wheel.
  • IT Manager: Convene a morning meeting with your DLT development team to map your current project's security and operational requirements to a concrete matrix of nine design principles and twelve implementation features, ensuring every technical feature is explicitly linked back to a high-level requirement.
  • Business Strategist: Review the business cases for upcoming e-commerce or digital transaction initiatives and categorize their core business needs into five structured requirement areas to determine if DLT is the appropriate technological solution before committing budget to pilot phases.
  • Researcher: Review the paper's preliminary taxonomy of five requirements, nine principles, and twelve features, and draft a protocol to expand the literature sample beyond the initial seven DSR publications to systematically validate and expand the DLT design knowledge library.
  • Policymaker: Initiate a draft for a standardized technology governance framework for public sector DLT implementations, utilizing a structured mapping of compliance requirements to concrete system features to ensure public blockchain projects are secure and auditable.
Design knowledge reuse, design knowledge library, blockchain, DLT
Designing and Building Personalized Agentic AI for Job Seekers
DESRIST 2026 · Part III

Matthew Mullarkey and Denis Edwards
🎤 Lev Kenning & Lore Koestler
This study presents the design, development, and simulation-based evaluation of a multi-agent AI system built to support job seekers across the entire career search lifecycle. Using the elaborated Action Design Research (eADR) method, the researchers developed a prototype featuring ten specialized LLM-powered agents managed by a supervisor coordinator and human-in-the-loop oversight. The system's functionality was evaluated using simulations with real candidate profiles and curated job postings.

The problem. While employers have rapidly adopted AI-driven systems to automate recruitment, candidates face a significant technological deficit, leading to high cognitive exhaustion and application abandonment. Existing tools only address isolated tasks like resume building rather than providing sustained, multi-step orchestration across the weeks-long job application lifecycle. This study addresses the need for a comprehensive, candidate-facing agentic system to balance this asymmetry.

Key findings.
– The prototype successfully sequenced and orchestrated all ten specialized agents to produce tailored outputs like customized resumes, cover letters, and company-specific interview prep materials (supported through simulation).
– The governance engine correctly enforced rate limits and personally identifiable information (PII) filtering while pausing for human-in-the-loop approval on high-risk actions such as submitting applications (supported through simulation).
– The study successfully validated all three proposed design principles regarding multi-agent specialization, hybrid human-in-the-loop checkpoints, and coordinator-led sequencing (supported through expert review and simulation).
– Limitations include the use of simulated, curated job postings rather than live APIs, evaluation on a small set of candidate profiles, and untested human-in-the-loop thresholds across broader demographics.
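The governance behaviour described in the findings (rate limits, PII filtering, and a pause for human approval on high-risk actions) can be sketched as a single gate that every agent action passes through. This is an assumption-laden illustration, not the authors' engine: the class name, the action names, and the email-only PII pattern are all hypothetical.

```python
import re
import time

# Simplified PII filter: only email addresses are redacted in this sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# Assumed list of actions that require a human decision.
HIGH_RISK_ACTIONS = {"submit_application"}


class GovernanceGate:
    """Hypothetical gate that checks each agent action before it runs."""

    def __init__(self, max_calls: int, per_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.clock = clock
        self._calls = []

    def _within_rate_limit(self) -> bool:
        now = self.clock()
        # Keep only the calls that still fall inside the sliding window.
        self._calls = [t for t in self._calls if now - t < self.per_seconds]
        if len(self._calls) >= self.max_calls:
            return False
        self._calls.append(now)
        return True

    def review(self, action: str, payload: str, approved_by_human: bool = False):
        """Return (status, redacted_payload).

        status is 'allowed', 'rate_limited', or 'needs_approval'.
        """
        redacted = EMAIL.sub("[REDACTED_EMAIL]", payload)
        if action in HIGH_RISK_ACTIONS and not approved_by_human:
            return "needs_approval", redacted
        if not self._within_rate_limit():
            return "rate_limited", redacted
        return "allowed", redacted
```

A coordinator would call `review` before dispatching any agent output; a `needs_approval` status is the point at which the workflow pauses for the human in the loop.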
What it means for you
  • CIO / IT Executive: Draft a technical directive mandating that all multi-agent AI systems deployed in your enterprise incorporate a centralized governance engine that enforces automated PII filtering and hard human-in-the-loop (HITL) approval gates before executing high-risk external actions.
  • IT Manager: Refactor your team's current monolithic LLM prototype into a multi-agent architecture using a supervisor-coordinator framework (such as LangGraph or AutoGen) to separate tasks like content generation, security checking, and execution into specialized agents.
  • Business Strategist: Allocate budget to develop a premium end-to-end career services feature that packages resume tailoring, cover letter generation, and mock interview prep into a single automated workflow, targeting the market gap left by competitors who only offer single-point resume tools.
  • Researcher: Draft a research proposal to integrate a live job search API (such as Indeed or LinkedIn) into the existing prototype to replace the simulated datasets, and design a testing protocol to measure user cognitive load and trust across diverse candidate demographics.
  • Policymaker: Convene a working group to draft guidelines on 'Equitable AI in Recruitment', establishing standards for candidate-facing agentic tools and ensuring automated corporate hiring systems do not block or penalize candidates who use authorized AI application assistants.
Agentic AI, Multi-Agent Systems, Job Search, eADR, LLM
An Agentic Workflow for Patient Medical History Summarization and Visualization
DESRIST 2026 · Part III

Jennifer Xu and Tamara Babaian
🎤 Liv Knowles & Leo Kant
This research-in-progress study explores the design of a hybrid agentic workflow that integrates large language model (LLM) agents with traditional software components. The system combines structured Electronic Health Records (EHR) data and unstructured clinical notes to generate a consolidated summary and interactive visualization of a patient's life-long medical history.

The problem. Medical practitioners face significant time constraints when trying to review comprehensive, lifetime patient histories scattered across multiple structured databases and unstructured text documents. Current EHR systems are not capable of automatically consolidating and summarizing these heterogeneous data types into an intuitive, chronological overview.

Key findings.
– Statistical significance testing was not conducted in this research-in-progress study.
– The feasibility of the hybrid system architecture, which successfully divides tasks between LLM agents (for summarization) and traditional code (for data retrieval and visualization), was demonstrated through a proof-of-concept prototype.
– A set of preliminary design principles was derived to guide the development of clinical agentic systems, focusing on controlling LLM output variation and minimizing hallucinations.
– The current study is limited by the use of synthetic EHR data from a small sample of 14 simulated patients and the lack of formal clinical testing with medical professionals.
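The division of labour in the findings above, with traditional code for retrieval and an LLM only for summarization, can be sketched as follows. The `encounters` table, its columns, and the `summarize` callable are hypothetical stand-ins; in a real system `summarize` would wrap an LLM call, which this sketch deliberately leaves out.

```python
import sqlite3
from typing import Callable


def build_timeline(
    conn: sqlite3.Connection,
    patient_id: str,
    notes: list,
    summarize: Callable[[str], str],
) -> list:
    """Merge deterministic EHR rows and summarized notes chronologically.

    notes is a list of (iso_date, free_text) pairs.
    """
    # Structured data comes from a plain parameterized SQL query, never from the LLM.
    rows = conn.execute(
        "SELECT event_date, description FROM encounters WHERE patient_id = ?",
        (patient_id,),
    ).fetchall()
    timeline = [(date, "ehr", desc) for date, desc in rows]
    # Only free text passes through the (possibly non-deterministic) summarizer.
    timeline += [(date, "note", summarize(text)) for date, text in notes]
    # ISO 8601 date strings sort chronologically as plain strings.
    return sorted(timeline)
```

Tagging each entry with its origin (`ehr` or `note`) keeps the LLM-derived parts of the timeline distinguishable from the deterministic ones, which makes later auditing easier.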
What it means for you
  • CIO / IT Executive: On Monday morning, review your organization's active Generative AI clinical pilots and mandate a hybrid architecture policy: restrict LLMs to processing unstructured text notes, and enforce the use of traditional, deterministic APIs/SQL queries for retrieving structured EHR data to prevent hallucinations.
  • IT Manager: On Monday morning, set up a Python-based sandbox environment using LangChain or Semantic Kernel to build a quick proof-of-concept pipeline that extracts a chronological patient timeline by combining synthetic EHR database rows with unstructured clinical notes.
  • Business Strategist: On Monday morning, schedule a meeting with the head of a high-complexity department (such as Oncology or Geriatrics) to baseline the average minutes clinicians spend manually reviewing historical patient charts, establishing a clear metric to measure future ROI of an automated visualization tool.
  • Researcher: On Monday morning, draft a study protocol to transition from synthetic patient data to a real-world de-identified dataset, and outline a Likert-scale usability survey to gather feedback from 10 clinical professionals on the prototype's chronological visualization design.
  • Policymaker: On Monday morning, draft an internal clinical AI governance guideline requiring that all LLM-generated medical history summaries display a side-by-side verification link to the original source EHR documents, ensuring clinicians can easily audit the agent's output.
Summarization, Large Language Models, Agentic Workflow, Electronic Health Records, Visualization
Dynamic Design Thinking, Building Innovation into the DSR Process
DESRIST 2026 · Part III

Frederick Johnson and Frederick K. Johnson
🎤 Lev Kenning & Lore Koestler
This study introduces a hybrid Design Science Research (DSR) framework that integrates Vijay Kumar's seven modes of innovation into Peffers et al.'s six-step process model. The researchers designed and evaluated this framework, alongside an AI-assisted decision-support prototype called DSR.Navigator, to guide method selection during DSR cycles.

The problem. While existing DSR models offer structural guidance, they lack curated, stage-aligned method options for researchers. This gap creates high cognitive load and forces researchers to rely on ad hoc or implicit method choices, which undermines both creativity and transparency.

Key findings.
– Preliminary expert evaluation indicated that the revised round-based prototype is clearer and more usable than the original framework.
– Analysis of prior DSR cycles indicated that structured integration of innovation modes reduces method-selection friction and enhances design logic transparency.
– The study is limited by its preliminary nature as a research-in-progress, with a formal evaluation of causal explanation and boundary conditions planned as the next step.
What it means for you
  • CIO / IT Executive: Instruct your Enterprise Architecture and IT R&D teams to replace ad-hoc software design processes by formally adopting a hybrid framework that maps Vijay Kumar’s seven modes of innovation directly onto Peffers’ six-step Design Science Research model for all upcoming high-risk IT modernization projects.
  • IT Manager: At your Monday morning team standup, introduce a mandatory 'Method Selection' step for the current sprint; require developers to explicitly map their chosen system-design methods (e.g., user journey mapping vs. rapid prototyping) against Kumar's innovation modes to reduce cognitive load and document design decisions.
  • Business Strategist: Create a standardized 'Design Logic' template for the digital product pipeline that requires project leads to justify their strategic innovation methods using a structured matrix, ensuring that early-stage market and product assumptions are backed by rigorous, transparent design logic.
  • Researcher: Review the methodology chapter of your current research-in-progress and map your design activities against a hybrid matrix of Peffers’ DSR steps and Kumar’s seven innovation modes to eliminate arbitrary method choices and increase the logical transparency of your artifact's evaluation.
  • Policymaker: Draft and insert a new requirement into the organization's technology procurement and grant funding guidelines mandating that applicants submit a structured 'Design Method Map' showing how their R&D steps systematically align with proven innovation frameworks rather than relying on ad-hoc development cycles.
Design science research, innovation integration, method selection, hybrid DSR framework, DSR process enhancement, design methodology
Towards a Multi-agent LLM-Based Tutoring Tool for Mathematical Argumentation Skills
DESRIST 2026 · Part III

Huda Koulani, Lucia Marchionne, Hendrikje Schmidtpott-Schulz, Andreas Eichler, Andreas Bley and Matthias Söllner
🎤 Liv Knowles & Leo Kant
This study adopts a Design Science Research approach to design and build a prototype of a multi-agent, Large Language Model-based tutoring tool named ProofTutor to support university students' mathematical argumentation skills. The researchers consolidated requirements from nine semi-structured user interviews and a review of educational technology literature to construct their system. The resulting tool is designed to deliver formative feedback on mathematical proofs and offer conversational, step-by-step tutorial guidance.

The problem. Undergraduate students often struggle to construct coherent, logically valid mathematical proofs and express their reasoning in precise mathematical language. While personalized feedback is vital for overcoming these challenges, instructors in large lecture settings face significant resource constraints that limit opportunities for individualized guidance. Existing automated tutoring tools frequently lack the capability to provide nuanced feedback on the logical structure and coherence of proof-writing.

Key findings.
– Based on nine semi-structured student interviews, the study successfully identified and consolidated six core user requirements, including the need for individualized feedback, multi-modal interaction, and step-by-step explanations.
– The researchers derived and instantiated four design principles into a web-based multi-agent system prototype called ProofTutor, which utilizes specialized language models to separate tasks such as draft analysis, logical error identification, and feedback synthesis.
– Because this is a Research-in-Progress paper, there are no quantitative statistical evaluations or empirical efficacy results regarding the tool's impact on student learning at this stage, representing a key limitation that the authors plan to address in future evaluations.
What it means for you
  • CIO / IT Executive: Direct your enterprise architecture team to evaluate multi-agent LLM orchestration frameworks (such as LangGraph or AutoGen) on Monday morning to assess how splitting complex tasks, like separating draft analysis from feedback synthesis, can be applied to upgrade the institution's existing monolithic student-facing chatbots.
  • IT Manager: Set up a technical sandbox environment on Monday morning and deploy a simple two-agent proof-of-concept using open-source LLMs (e.g., Llama-3 via Ollama), where Agent A checks basic syntax errors in a text and Agent B drafts a polite, encouraging email response to the user, to measure latency and API overhead of multi-agent architectures.
  • Business Strategist: Initiate a cost-benefit analysis on Monday morning comparing the operational API costs of running a multi-agent AI tutoring system against the recruitment, training, and retention costs of human Graduate Teaching Assistants (GTAs) for undergraduate STEM courses with enrollments exceeding 150 students.
  • Researcher: Draft an Institutional Review Board (IRB) research protocol on Monday morning to design a controlled A/B evaluation study that compares the proof-writing performance of 50 undergraduate math students using the multi-agent ProofTutor tool against a control group of 50 students using a standard, single-prompt ChatGPT interface.
  • Policymaker: Draft an interim institutional guideline on Monday morning outlining the acceptable use of LLMs in academic coursework, explicitly mandating that AI-driven tutoring tools like ProofTutor must only be utilized for formative guidance and self-study, and are strictly prohibited from being used for summative grading or high-stakes assessments.
Mathematical Argumentation, Multi-agent LLM System, Tutorial Guidance, Design Science Research, Formative Feedback
AI-Enabled Automation of High-Risk Decision-Making Processes
DESRIST 2026 · Part III

Pascal Nemecek, Tobias Zimmermann, Sandro Franzoi and Jan Vom Brocke
🎤 Liv Knowles & Leo Kant
This research-in-progress study employs an Echeloned Design Science Research (eDSR) approach to establish a design framework for AI-enabled high-risk decision-making systems. Collaborating with a large German financial institution, the researchers conduct semi-structured interviews and a think-aloud study with industry experts to define and validate design goals. The methodology focuses on developing early-stage design objectives and requirements that balance automated technical capabilities with strict regulatory compliance.

The problem. Organizations aiming to automate high-risk operational processes, such as loan approvals, face strict regulatory constraints under frameworks like the EU AI Act and GDPR. These regulations mandate rigorous human oversight, traceability, explainability, and personal data protection, which are difficult to implement using typical black-box AI models. Currently, there is a lack of comprehensive sociotechnical design knowledge to help organizations structure systems that optimize process efficiency while fully complying with these laws.

Key findings.
– Domain experts qualitatively validated the relevance and completeness of seven design objectives (DO1-DO7) and their corresponding requirements (DR1.1-DR7.1).
– Regulatory, technical, and organizational challenges, specifically regarding data privacy, explainable AI, and human reviewer cognitive overload, were confirmed as major development hurdles (Supported qualitatively).
– Proposed economic challenges were not supported as active adoption barriers, as practitioners viewed automated decision-making as a necessary strategic investment rather than a cost-prohibitive expense (Not supported).
– A fundamental feasibility trade-off was validated between rigorous human oversight (which increases compliance) and process automation (which increases efficiency).
– Limitations: The findings are based on an early-stage, in-progress research project within a single German financial institution and lack summative or quantitative evaluation.
What it means for you
  • CIO / IT Executive: On Monday morning, audit your current high-risk automated decision-making pipelines (such as credit scoring) to identify where black-box models are used, and mandate that your architecture team draft a plan to integrate Explainable AI (XAI) visualization dashboards to prevent cognitive overload for your human compliance reviewers.
  • IT Manager: On Monday morning, set up a technical sprint planning meeting to translate the EU AI Act's traceability and human-oversight mandates into concrete, non-functional engineering requirements (DR1.1-DR7.1) in your product backlog for any high-risk AI models currently in development.
  • Business Strategist: On Monday morning, revise the financial and ROI models for your AI automation projects to remove the expectation of '100% automated efficiency,' and instead build in strategic resource buffers for 'human-in-the-loop' compliance checkpoints, treating regulatory alignment as a strategic investment rather than a cost barrier.
  • Researcher: On Monday morning, draft a research proposal for a quantitative survey or multi-case study targeting financial institutions across France and Italy to statistically validate the generalizability of the 7 design objectives (DO1-DO7) and address the limitation of the current single-institution German study.
  • Policymaker: On Monday morning, direct your regulatory task force to design a 'Regulatory Sandbox' framework specifically tailored for high-risk financial AI, providing firms with practical guidance on how to mathematically and operationally balance the trade-off between strict EU AI Act human-oversight rules and technical process efficiency.
Intelligent Process Automation, Design Science Research, High-Risk Decision-Making, Artificial Intelligence, EU AI Act, Human Oversight
Making Informal Work Visible: Multimodal AI-Enabled Process Discovery in Public-Sector Meetings
DESRIST 2026 · Part III

Alina Hafner, Mohamed Aziz Ketata and Holger Wittges
🎤 Liv Knowles & Leo Kant
This study details the design, implementation, and preliminary evaluation of the Meeting Process Twin, an AI-enabled artifact designed to discover and analyze meeting-based decision processes. The tool extracts task-level events from multimodal inputs, such as audio transcripts and visual interactions, to map actual meeting progressions and compare them against formal, agenda-based models.

The problem. A significant gap exists in business process management because many crucial organizational decisions are made during meetings that do not generate structured digital event logs. This lack of visibility makes it difficult to systematically observe and analyze informal practices and deviations from formal agendas, which can create accountability and compliance challenges in governance-sensitive sectors.

Key findings.
– The technical feasibility of the Meeting Process Twin was verified across 54 city council meetings, revealing a low mean deduplicated process fitness of 0.314 and a shadow activity prevalence of 41.9% (supported by sensitivity analysis across similarity thresholds from t = 0.15 to t = 0.60).
– Of the detected shadow activities, 95.7% were statistically classified as innovative practices rather than compliance failures, highlighting regular pre-agenda and off-agenda discussion patterns.
– Limitations include that the current iteration is restricted to single-camera recordings and has only undergone preliminary feasibility testing, with full accuracy validation against human-annotated ground truths still in progress.
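The sensitivity analysis in the findings varies a similarity threshold t. To show why a prevalence figure moves with t, the sketch below uses assumed definitions: token-level Jaccard similarity between activity labels and agenda items, and "shadow" meaning that an activity's best agenda match falls below t. The paper's actual similarity measure and fitness computation are not given in this summary, so none of this should be read as its method.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two short labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def shadow_prevalence(activities: list, agenda: list, t: float) -> float:
    """Share of extracted activities whose best agenda match is below t."""
    if not activities:
        return 0.0
    shadow = [
        act
        for act in activities
        if max((jaccard(act, item) for item in agenda), default=0.0) < t
    ]
    return len(shadow) / len(activities)


# Invented example labels, not data from the study.
activities = [
    "approve budget report",
    "discuss parking complaints",
    "budget report questions",
    "informal zoning chat",
]
agenda = ["approve budget report", "zoning ordinance vote"]

for t in (0.15, 0.60):
    print(t, shadow_prevalence(activities, agenda, t))
```

A stricter threshold counts more activities as off-agenda, which is why reporting results across a range of t values is more informative than a single cut-off.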
What it means for you
  • CIO / IT Executive: Authorize a pilot budget to integrate a multimodal AI meeting capture tool into your organization's highest-stakes governance committee meetings, converting unstructured audio and visual interactions into structured event logs to capture undocumented decision-making.
  • IT Manager: Conduct a technical audit of your primary boardrooms to ensure the existing single-camera and microphone hardware can output the synchronized, high-resolution audio and video streams required to feed a Meeting Process Twin pipeline.
  • Business Strategist: Update your department's standard process-mapping templates to officially include a 'pre-agenda brainstorming' phase, restructuring workflows to accommodate and formalize the highly innovative shadow practices that occur off-agenda.
  • Researcher: Draft a coding protocol to recruit human annotators to label event transitions in 10 archived meeting recordings, establishing a ground-truth dataset to validate the event-extraction accuracy of the Meeting Process Twin.
  • Policymaker: Draft a policy amendment for public-sector meeting guidelines that introduces a formal 'collaborative sandbox' agenda item, providing a transparent, compliant space for the undocumented off-agenda discussions that currently surface as shadow activities (41.9% prevalence in the study).
Business Process Management, Design Science Research, Multimodal Process Mining, Informal Work, Meeting Process Twin, Shadow Processes
Designing Open Access Platforms for Multilingual Language Awareness in Early Childhood Education
DESRIST 2026 · Part III

Mara Burger, Niklas Kloth and Christina Vom Brocke
🎤 Lev Kenning & Lore Koestler
This study details the iterative design and implementation of an open-access web platform designed to foster multilingual language awareness in early childhood education. Using an echeloned Design Science Research approach, the researchers developed a functional prototype centered on multimodal, self-paced storytelling. The prototype was evaluated qualitatively through interviews and a focus group with educational experts, parents, and software developers.

The problem. Although linguistic diversity is common in early classrooms, digital educational resources are often strictly monolingual or lack pedagogically appropriate support for educators. Existing tools frequently encourage isolated child-device interaction rather than interpersonal engagement. Consequently, educators lack adaptable, open-access tools designed specifically for collaborative multilingual learning in resource-constrained environments.

Key findings.
– The evaluation relied entirely on qualitative feedback from a small sample (8 participants), meaning no statistical hypothesis testing was conducted.
– Evaluators qualitatively supported the platform's usefulness, noting that the simple, minimalist layout effectively reduced visual clutter and helped keep children focused.
– The manual, stepwise story progression was qualitatively supported as a feature that encourages active, discussion-based engagement rather than passive media consumption.
– The design concept of a single hybrid interface for all users was not supported by evaluators, who recommended separate, role-specific entry points for parents, educators, and children to resolve user confusion.
– Study limitations include the preliminary nature of the evaluation, the small sample size of eight participants, and the lack of validation in real-world kindergarten environments.
What it means for you
  • CIO / IT Executive: On Monday morning, instruct your software architecture team to immediately halt any development of unified 'one-size-fits-all' user portals for your educational platforms, and mandate the creation of three distinct, role-specific entry points and interfaces tailored specifically for parents, educators, and children to resolve user confusion.
  • IT Manager: On Monday morning, audit your early childhood learning software's UI backlog to remove automated media auto-play features, replacing them with manual, stepwise 'next' navigation buttons, and strip out non-essential decorative widgets to establish a minimalist layout that prevents cognitive overload in children.
  • Business Strategist: On Monday morning, update your product portfolio roadmap to prioritize 'collaborative, non-isolated multilingual tools' as a key market differentiator, targeting early childhood educators who currently reject passive, monolingual screen-time applications in favor of discussion-based co-use tools.
  • Researcher: On Monday morning, draft a research protocol and partner agreement to deploy the open-access multilingual prototype in a sample of ten local kindergarten classrooms for a four-week field study, transitioning from the initial qualitative evaluation of 8 participants to quantitative, real-world statistical hypothesis testing.
  • Policymaker: On Monday morning, draft a pilot grant program or procurement guideline that prioritizes public funding for early childhood educational technologies that are open-access, support multilingualism, and utilize manual interactive pacing rather than passive, isolated child-device interaction.
Early Childhood Education and Care, ECEC, Educational Technology, Design Science Research, DSR, Multilingualism, Language Awareness
Designing a Socio-Technical Support System for Reuse in Low-Code Development: An Echeloned Design Science Research Study
DESRIST 2026 · Part III

Marlon Kampmann, Peter A. François, Ralf Plattfaut and André Coners
🎤 Liv Knowles & Leo Kant
This study employs an echeloned Design Science Research (eDSR) approach to develop design knowledge for socio-technical systems that support sustainable reuse in low-code development. The methodology integrates a structured literature review of 77 papers with 20 semi-structured expert interviews across various industries. The study aims to construct and validate a foundational framework of design objectives and requirements for low-code reuse.

The problem. Organizations frequently struggle to scale low-code development initiatives because developers consistently recreate existing functionalities instead of reusing them. This issue leads to redundant code, higher maintenance costs, and decreased infrastructure flexibility. The root of the problem lies in the lack of organizational, individual, and procedural support mechanisms rather than limitations in the technical capabilities of low-code platforms.

Key findings.
– A validated problem statement establishing that effective low-code reuse goes beyond technical enablement and requires organizational, individual, and procedural support (supported qualitatively through literature and 20 expert interviews; no statistical hypothesis testing was conducted).
– Five validated design objectives designed to systematically foster low-code reuse: Systematic Reuse, Multi-Level Reuse, Cognitive Relief, Organizational Alignment, and Inclusiveness (supported qualitatively; no statistical testing was conducted).
– Six specific design requirements, including ensuring platforms are usable for both professional and citizen developers and making reusable artifacts explicitly identifiable (supported qualitatively; no statistical testing was conducted).
What it means for you
  • CIO / IT Executive: Establish a formal charter for a Low-Code Center of Excellence (CoE) and mandate that 15% of the IT budget for new low-code applications be reallocated toward cataloging, certifying, and maintaining reusable components rather than building from scratch.
  • IT Manager: Create a pinned 'Verified Reusable Assets' catalog in your low-code platform and implement a mandatory 15-minute 'reuse check' step in your team's sprint planning to search this catalog before any new feature development begins.
  • Business Strategist: Update the digital initiative intake form for business units to require citizen developers to identify and document at least two existing low-code templates or modules they will reuse before their project can be approved for launch.
  • Researcher: Draft a mixed-methods research protocol to quantitatively test the 'Cognitive Relief' objective by measuring the time-to-delivery and task load index (NASA-TLX) of citizen developers using search-optimized vs. unorganized low-code repositories.
  • Policymaker: Incorporate a 'Socio-Technical Reuse' standard into the agency's procurement guidelines, requiring any purchased low-code platform to feature an open-access registry accessible by both professional and non-technical staff.
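The Researcher step above names the NASA-TLX task load index. As a minimal sketch of how such a comparison might be scored, the unweighted "Raw TLX" variant averages the six subscale ratings, each on a 0 to 100 scale. The ratings and the two repository conditions below are invented for illustration and are not data from the study.

```python
from statistics import mean

# The six NASA-TLX subscales; Raw TLX is their unweighted mean.
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def raw_tlx(ratings: dict[str, float]) -> float:
    """Return the Raw TLX score (0-100) for one participant."""
    missing = [s for s in SUBSCALES if s not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    return mean(ratings[s] for s in SUBSCALES)


# Hypothetical ratings for one citizen developer in each repository condition.
organized = {"mental": 40, "physical": 10, "temporal": 35,
             "performance": 25, "effort": 40, "frustration": 30}
unorganized = {"mental": 70, "physical": 10, "temporal": 60,
               "performance": 50, "effort": 65, "frustration": 75}

# A positive difference means the unorganized repository felt more demanding.
difference = raw_tlx(unorganized) - raw_tlx(organized)
print(raw_tlx(organized), raw_tlx(unorganized), difference)
```

A real protocol would collect these ratings from many participants per condition and pair them with time-to-delivery before running any statistical test.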
Low-Code Development, Reuse, Design Science Research, Socio-Technical Systems, Citizen Developers, Process Automation
Nascent Design Principles for Human-Machine Interfaces in Airborne Manned-Unmanned Teaming
DESRIST 2026 · Part III

Jan-Paul Huttner and Dominik Siemon
🎤 Liv Knowles & Leo Kant
This research-in-progress study uses an echeloned design science research framework to systematically synthesize findings from 10 empirical studies on fixed-wing manned-unmanned teaming (MUM-T). The authors propose four theory-grounded nascent design principles to establish an initial prescriptive knowledge base for designing human-machine interfaces (HMIs). This foundational work serves as a starting point for future interface instantiation and evaluation with operational pilots.

The problem. Although manned-unmanned teaming is increasingly deployed in civilian and military aviation, the design of human-machine interfaces remains theoretically fragmented and specific to individual systems. Existing cockpit standards and policies lack prescriptive guidelines for managing cognitive challenges unique to airborne multi-platform supervision under severe time constraints. Consequently, designers lack generalized, theory-justified principles to build interfaces that effectively support human-autonomy collaboration.

Key findings.
– Proposed four nascent design principles (Context-Adaptive Automation Intervention, Hierarchical Delegation Granularity, Integrated Temporal-Spatial Plan Visualization, and Bidirectional Human-Autonomy Negotiation) synthesized from prior literature (not yet empirically evaluated as a combined system in this research-in-progress study).
– Justified the mechanisms of the proposed design principles using established kernel theories from cognitive engineering and human-automation interaction (Theoretically supported).
– Noted study limitations, including a small synthesis sample of 10 studies, a focus exclusively on fixed-wing platforms, and the exclusion of classified military data, which may restrict the generalizability of the findings.
What it means for you
  • CIO / IT Executive: Review the technical requirements for upcoming manned-unmanned teaming (MUM-T) software procurements and mandate that vendors align their HMI architectures with the four proposed design principles, specifically Context-Adaptive Automation Intervention and Bidirectional Human-Autonomy Negotiation, before you sign off on system integration.
  • IT Manager: Gather your HMI design and software engineering teams for a sprint planning session to map current cockpit interface prototypes against the 'Hierarchical Delegation Granularity' and 'Integrated Temporal-Spatial Plan Visualization' principles to identify and resolve potential user cognitive overload points.
  • Business Strategist: Conduct a portfolio analysis of your company's civilian and military aviation offerings to identify where integrating these four theoretical HMI principles can serve as a market differentiator, focusing on safety selling points for multi-platform supervision under extreme time constraints.
  • Researcher: Draft a research proposal and design a flight-simulator experiment to empirically test and evaluate the four proposed HMI principles as a combined system with operational pilots, specifically expanding the scope to rotary-wing platforms to address the study's generalizability limitations.
  • Policymaker: Initiate a working group within your regulatory agency to draft updated aviation certification standards that move from static interface guidelines to dynamic standards incorporating 'Context-Adaptive Automation Intervention' and 'Bidirectional Human-Autonomy Negotiation' for multi-platform supervision.
Manned-Unmanned Teaming, Human-Machine Interface, Design Principles, Design Science Research, Airborne Operations
When Digital Tools Enter the Playground: Designing Information Systems for IT-Distant Care Contexts
DESRIST 2026 · Part III

Niklas Korte and Florian Lüttgenau
🎤 Liv Knowles & Leo Kant
This study investigates how digital information systems can be designed for and integrated into IT-distant, care-oriented environments. Using a multi-year, interpretive Design Science Research approach, the researchers developed and evaluated a socio-technical ecosystem in a German all-day school over four iterative design cycles. The evaluation combined qualitative methods including ethnographic field notes, interviews, and workshops.

The problem. Traditional information systems design assumes office-like settings characterized by standardized processes, stable user roles, and uninterrupted interactions. However, care-oriented and educational settings rely on relational, situational work and highly heterogeneous user groups. This mismatch frequently causes digital systems to be resisted, bypassed, or reduced to administrative burdens in practice.

Key findings.
– Stakeholder heterogeneity across staff, administration, children, and parents introduces divergent digital literacies and accountability demands (empirically supported by qualitative field observations).
– Care-oriented work demands constant attention shifts, making brief, interruptible digital interactions a primary design requirement over sustained, screen-centered use (empirically supported by qualitative field observations).
– System viability in care contexts depends heavily on multi-stakeholder emotional alignment and trust, where parental privacy and surveillance concerns are as critical as functional requirements (empirically supported by qualitative field observations).
– The study is limited by its focus on a single German after-school care facility and a specific hybrid terminal and NFC-wristband setup, which may restrict generalizability to purely software-based systems or other care environments.
What it means for you
  • CIO / IT Executive: On Monday morning, institute a new architectural policy for care-context software procurement that mandates a 'maximum 5-second interaction window' (e.g., quick NFC taps or single-button inputs) to prevent digital tools from distracting frontline staff from physical caregiving duties.
  • IT Manager: On Monday morning, shadow a frontline educator or caregiver for one hour during their peak care shift to document where they are currently bypassing the digital system or getting distracted by screens, and use these field notes to map out interface simplification steps.
  • Business Strategist: On Monday morning, audit your care-product roadmap and draft a 'Trust and Privacy' positioning document that reframes your product's security architecture, specifically addressing parental surveillance anxieties by emphasizing local data minimization over continuous cloud-based tracking.
  • Researcher: On Monday morning, write a proposal to replicate this multi-stakeholder design science framework in a purely software-based care setting (such as a mobile-only app in adult day-care) to test if the 'interruptibility' and trust requirements can be successfully met without hybrid physical terminals and NFC wristbands.
  • Policymaker: On Monday morning, initiate a draft for an updated IT procurement standard for public care and educational institutions, requiring that any digital tool seeking public funding must pass a 'cognitive load and interruptibility' audit to ensure it does not detract from relational care work.
IT-distant context, Heterogeneous user, In-situ evaluation, Socio-technical ecosystem, Design Science Research, Care-oriented environments
Leading Through Change: Designing an AI Companion for Decision Support from Continuous Conversation with Strategic Decision Makers
DESRIST 2026 · Part III

Jan vom Brocke, Bianca van Dellen, Zoe Zoepffel, Tobias Zimmermann and Timo Strohmann
🎤 Lev Kenning & Lore Koestler
This study describes the design and initial evaluation of an AI Strategy Companion developed to assist senior executives with strategic decision-making. Using an echeloned design science research framework, the authors built a prototype that structures and provides access to peer-derived, real-world leadership experiences.

The problem. Strategic decision-makers currently lack tailored decision support when navigating continuous organizational change. Because peer-level experiential knowledge is often tacit and fragmented, executives must rely on personal intuition or external consultants rather than systematic insights from peers in similar situations.

Key findings.
– Because this is an exploratory, early-stage research-in-progress study, no statistical hypothesis testing was conducted; findings are purely qualitative and based on a small sample of two senior executives.
– Qualitative feedback supported the feasibility of capturing and structuring executive reasoning through continuous, structured conversations.
– The evaluation supported the concept of the AI as a 'cognitive partner' that facilitates reflection and questioning of assumptions, rather than acting as an autonomous decision-maker.
– The participants valued anonymous, spontaneous access to peer-derived insights, noting that it provides temporal and spatial independence during complex decision-making processes.
What it means for you
  • CIO / IT Executive: Schedule a scoping meeting with your Enterprise Architecture and Knowledge Management teams to outline a pilot for a conversational AI 'cognitive partner' that captures, anonymizes, and structures past project decision-logs to serve as a reflective tool for senior leadership.
  • IT Manager: Create a custom system prompt in your team's internal LLM sandbox that instructs the AI to act as a Socratic challenger rather than an answer-generator, testing it with your current project roadmap to see if it successfully exposes hidden operational assumptions.
  • Business Strategist: Before submitting your next strategic proposal, paste your core recommendations into a secure corporate AI portal and prompt it to act as an external, skeptical peer executive to identify three key blind spots and alternative perspectives in your logic.
  • Researcher: Draft a research proposal to scale this exploratory design by recruiting a cohort of 15 local mid-market CEOs for a qualitative study, measuring how structured AI-driven reflection affects their decision-making confidence over a 30-day period.
  • Policymaker: Draft a data-governance framework specifically for executive-level AI usage, defining the strict anonymization and security standards required for sharing peer-derived leadership insights across different departments without risking intellectual property leaks.
Change, Strategic Decision Making, AI Companion, AI Strategy Companion, Design Science Research, DSR
Design Knowledge Quality in Human-AI Co-Design
DESRIST 2026 · Part III

Sanaz Nabavian and Jeffrey Parsons
🎤 Lev Kenning & Lore Koestler
This research-in-progress paper proposes an evaluation framework for assessing prescriptive design knowledge within human-Generative AI (GenAI) co-design. The proposed methodology outlines system-level quality criteria and plans to evaluate them using a vignette-based study comparing different information retrieval techniques and prompt engineering levels.

The problem. Existing design science evaluation criteria assume human designers are the sole actors capable of interpreting and adapting abstract design principles. As GenAI tools increasingly act as co-designers, there is an unaddressed research gap regarding how to evaluate whether AI systems can successfully retrieve, interpret, and operationalize design knowledge.

Key findings.
– This is a conceptual research-in-progress paper, meaning no empirical hypotheses have been statistically tested or validated yet.
– The study successfully conceptualizes an evaluation framework comprising eight system-level quality criteria: activation, alignment, explainability, traceability, correctness, consistency, efficiency, and final quality.
– The paper proposes a vignette-based evaluation methodology using two generative information retrieval models (self-RAG and SearChain/MINDER) and three levels of prompt specificity to empirically test the framework in future research.
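The proposed evaluation can be pictured as a grid of conditions. The sketch below assumes a fully crossed design (every retrieval approach at every prompt level), which the summary does not confirm. The prompt level labels `low`, `medium`, and `high` are hypothetical, since the summary only states that there are three levels.

```python
from itertools import product

# Retrieval approaches, labeled as in the summary above.
RETRIEVAL_MODELS = ("self-RAG", "SearChain/MINDER")
# Hypothetical labels: the summary only says there are three levels.
PROMPT_LEVELS = ("low", "medium", "high")
# The eight system-level quality criteria listed in the findings.
CRITERIA = ("activation", "alignment", "explainability", "traceability",
            "correctness", "consistency", "efficiency", "final quality")

# Assumed full crossing: 2 retrieval approaches x 3 prompt levels = 6 conditions.
conditions = list(product(RETRIEVAL_MODELS, PROMPT_LEVELS))

# One empty score sheet per condition, to be filled in for each vignette.
score_sheets = {cond: dict.fromkeys(CRITERIA) for cond in conditions}

for model, level in conditions:
    print(f"{model:>18} | prompt specificity: {level}")
```

Keeping the criteria as the keys of each score sheet makes it straightforward to treat them as dependent variables once ratings are collected.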
What it means for you
  • CIO / IT Executive: Update your organization's GenAI procurement and evaluation templates to include the paper's eight system-level quality criteria, specifically prioritizing 'traceability' and 'explainability', to assess whether prospective AI co-design tools can safely operationalize prescriptive corporate design standards.
  • IT Manager: Set up a technical spike to test your team's internal retrieval-augmented generation (RAG) tools, running a self-RAG approach at three levels of prompt specificity to benchmark how reliably the AI retrieves and interprets complex system architecture designs.
  • Business Strategist: Map your product development lifecycle to identify where GenAI is used as a co-designer, and establish a risk-mitigation workflow that requires human validation of AI outputs specifically against the 'correctness' and 'alignment' criteria defined in this research.
  • Researcher: Draft an experimental protocol using a vignette-based methodology to empirically test self-RAG versus SearChain/MINDER across different levels of prompt specificity, using the paper's eight conceptualized criteria as your dependent variables.
  • Policymaker: Incorporate 'traceability' and 'explainability' metrics from this co-design framework into your draft guidelines for algorithmic accountability, establishing a standard for how Generative AI systems must document their interpretation of safety-critical design principles.
Human-AI Co-design, Design Knowledge Quality, Design Knowledge Evaluation Framework, Generative AI, Information Retrieval
Customer Contribution-Aware Incident Management: A Problem Space Explication for Digitally Mediated Service Episodes
DESRIST 2026 · Part III

Peter Hottum and Daniel Heinz
🎤 Lev Kenning & Lore Koestler
This study uses qualitative methods to explicate the problem space of customer-dependent incident management in digital support environments. Through semi-structured interviews with seven experts at a major industrial equipment manufacturer, the researchers analyze the operational challenges that arise when service progress relies on customer contributions.

The problem. Existing IT service management frameworks often overlook how customer-dependent readiness impacts service episodes, focusing instead purely on provider-side throughput. This contribution blindness creates operational bottlenecks, as teams lack systematic methods to qualify, monitor, and route tickets based on whether the customer has provided the necessary inputs or access.

Key findings.
– Identified three core challenges in digital incident management: unqualified early contribution signals that leave work readiness ambiguous, ad-hoc tool workarounds that consume specialist capacity, and contribution-blind routing that pushes unprepared cases into expert queues.
– Derived four design objectives for future contribution-aware systems, focusing on early readiness qualification, staged intervention logic, readiness-aware flow control, and transparent decision governance.
– Developed three evaluation criteria (operational benefit, decision defensibility, and stakeholder acceptability) to guide the assessment of subsequent software artifacts.
– Acknowledged study limitations, noting that as a qualitative research-in-progress paper, the findings rely on a small sample of seven interviews within a single industrial organizational context.
What it means for you
  • CIO / IT Executive: Instruct your ITSM platform architect to implement a hard technical gate in your ticketing system (e.g., ServiceNow or Jira Service Management) that blocks support agents from escalating tickets to high-cost Tier 3/expert queues unless a 'Customer Readiness Checklist' is marked as complete, confirming that the customer has provided the necessary logs, configurations, and environment access.
  • IT Manager: Conduct a 9:00 AM huddle with your triage team to audit the active support backlog, identify all tickets currently stalled due to missing customer inputs, and move them into a newly designated 'Awaiting Customer Contribution' holding queue to immediately shield your specialist engineers from wasting capacity on unready cases.
  • Business Strategist: Redesign the customer-facing support portal's ticket submission workflow so that high-frequency issue categories dynamically require the customer to upload specific diagnostic data or toggle a 'Remote Access Granted' checkbox before the 'Submit Ticket' button becomes active.
  • Researcher: Draft a survey instrument targeted at IT service professionals across multiple external B2B sectors to quantitatively measure the frequency and financial impact of 'contribution-blind routing' workarounds, addressing the limitation of the original study's single-firm, seven-interview sample.
  • Policymaker: Draft and issue an update to the corporate Service Level Agreement (SLA) policy framework that officially establishes 'Contribution-Adjusted SLAs,' mandating that the resolution clock automatically pauses the moment a ticket is flagged as awaiting essential customer-side readiness inputs.
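Two of the steps above, the readiness gate and the pausing resolution clock, can be sketched together. This is an illustrative model only, not an integration with any ticketing product: the `Ticket` class, the queue names, and the three checklist items are assumptions drawn from the examples in the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Hypothetical checklist items, based on the examples given in the text.
REQUIRED_INPUTS = frozenset({"logs", "configuration", "environment_access"})


@dataclass
class Ticket:
    opened_at: datetime
    provided_inputs: set[str] = field(default_factory=set)
    # Closed (start, end) intervals spent waiting on the customer.
    # Assumed not to overlap and to lie between opened_at and "now".
    waiting_intervals: list[tuple[datetime, datetime]] = field(default_factory=list)

    def is_ready(self) -> bool:
        """True once every required customer input has been provided."""
        return REQUIRED_INPUTS <= self.provided_inputs


def route(ticket: Ticket) -> str:
    """Readiness gate: only ready tickets may enter the expert queue."""
    return "expert_queue" if ticket.is_ready() else "awaiting_customer_contribution"


def sla_elapsed(ticket: Ticket, now: datetime) -> timedelta:
    """Elapsed resolution time with the clock paused while waiting on the customer."""
    paused = sum((end - start for start, end in ticket.waiting_intervals), timedelta())
    return (now - ticket.opened_at) - paused


opened = datetime(2026, 1, 5, 9, 0)
ticket = Ticket(opened_at=opened, provided_inputs={"logs"})
first_route = route(ticket)  # inputs incomplete, so the ticket is held back

# The customer takes four hours (10:00 to 14:00) to supply the missing inputs.
ticket.waiting_intervals.append((opened + timedelta(hours=1), opened + timedelta(hours=5)))
ticket.provided_inputs |= {"configuration", "environment_access"}
second_route = route(ticket)
elapsed = sla_elapsed(ticket, opened + timedelta(hours=8))
print(first_route, second_route, elapsed)
```

Recording waiting time as explicit intervals, rather than as a single flag, keeps the clock adjustment auditable, which fits the study's design objective of transparent decision governance.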
Customer contribution, Incident management, Digitally mediated service episodes, Remote operations support, Design science research
Designing Process Mining to Mitigate Electronic Performance Monitoring Risks
DESRIST 2026 · Part III

Jannis Nacke, Ralf Plattfaut and René Riedl
🎤 Liv Knowles & Leo Kant
This study examines how Process Mining systems can be designed and embedded in organizations to support positive outcomes rather than being perceived as invasive surveillance. Using a design science research approach and an in-depth qualitative case study of a large utility organization, the authors identify key socio-technical mechanisms that mitigate electronic monitoring risks. From these mechanisms, they derive five preliminary design principles to guide the responsible implementation of analytics-based systems.

The problem. Advanced analytics tools like Process Mining offer detailed visibility into operations, but they run the risk of being perceived by employees as stress-inducing electronic performance monitoring. Prior research has focused primarily on the technical capabilities of these tools rather than on how they can be socio-technically designed to avoid triggering negative employee reactions. Consequently, there is a lack of actionable, design-oriented knowledge on how to configure and introduce these systems to foster trust and learning.

Key findings.
– Qualitative analysis indicated that negative Electronic Performance Monitoring (EPM) effects did not materialize because Process Mining insights were consistently interpreted as signals of process performance rather than indicators of individual behavior.
– The study derived five preliminary design principles: (1) Governance-Driven Trust, (2) Aggregated Process-Level Transparency, (3) Process Focus as a Boundary, (4) Guided Onboarding, and (5) Evidence-Based Steering without individual sanction.
– Empirical evidence indicated that restricting visibility to aggregated, pseudonymized process views rather than individual metrics shifted accountability from individual blame to process structures and supported collaborative problem-solving.
– Limitations: As a research-in-progress study, the findings are limited to a single qualitative case study of a large, highly regulated German utility provider, so the design principles require further long-term validation and evaluation.
What it means for you
  • CIO / IT Executive: Draft and issue a formal IT directive requiring that all active and future process mining tools (such as Celonis or UiPath) be configured to default to aggregated, pseudonymized data, explicitly disabling any dashboards or features that track individual-level worker throughput or activity logs.
  • IT Manager: Go into the admin settings of your department's process mining software, strip out direct employee identifiers (such as names or specific user IDs) from the event log ingestion pipelines, and replace them with pseudonymous tokens (for example, keyed or salted hashes) before the next data refresh.
  • Business Strategist: Revise the agenda for this week's operational review meeting to remove any slides attributing process delays to individual team members, and restructure the discussion entirely around system-level flow bottlenecks (e.g., handoff delays between departments) to focus on collaborative problem-solving.
  • Researcher: Draft a research proposal for a longitudinal, multi-case study that tests the five preliminary design principles, specifically comparing their effectiveness in mitigating Electronic Performance Monitoring (EPM) anxiety in a fast-paced, non-unionized tech startup versus the highly regulated utility context of the original study.
  • Policymaker: Draft a clause for the upcoming workplace privacy and algorithmic management guidelines that mandates organizations using process-level analytics to strictly decouple system optimization data from individual disciplinary and performance appraisal processes.
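The IT Manager step above, replacing direct employee identifiers before event logs are ingested, might look like the following minimal sketch. It uses a keyed hash (HMAC-SHA256) from the Python standard library as one possible pseudonymization technique. The event log layout, the `user` field name, and the placeholder key are assumptions for illustration and do not reflect any vendor's pipeline.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability in dashboards


def pseudonymize_log(events: list[dict], secret_key: bytes,
                     id_field: str = "user") -> list[dict]:
    """Return a copy of the event log with the identifier field replaced."""
    return [{**event, id_field: pseudonymize(event[id_field], secret_key)}
            for event in events]


# Hypothetical event log rows: case id, activity, and the employee who acted.
events = [
    {"case": "C1", "activity": "Create order", "user": "alice"},
    {"case": "C1", "activity": "Approve order", "user": "bob"},
    {"case": "C2", "activity": "Create order", "user": "alice"},
]

# Placeholder key; in practice it would be stored outside the analytics environment.
key = b"replace-with-a-secret-from-a-vault"
safe_events = pseudonymize_log(events, key)
print(safe_events)
```

A keyed hash keeps the same person mapped to the same token, so process flows remain analyzable, but anyone holding the key can re-link tokens to people. Pseudonymized logs therefore still need the aggregation and governance measures described in the design principles.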
Process Mining, Design Principles, Design Science Research, Electronic Performance Monitoring, Digital Surveillance, Socio-technical Design
Towards a Theory of Performance Measurement System Design
DESRIST 2026 · Part III

Charlotte Bahr, Willi Tang, Victor Ulherr and Martin Matzner
🎤 Lev Kenning & Lore Koestler
This research-in-progress study explores how performance measurement systems can be designed to balance technical constraints and dynamic business requirements. The researchers apply imbrication theory as an analytical lens to investigate a case study of a medium-sized IT service provider developing a centralized dashboard platform.

The problem. Designing and implementing performance measurement systems is often resource-intensive, costly, and difficult to sustain due to technical and organizational constraints. A critical research gap exists in understanding how conceptual performance definitions, technical implementations, and organizational arrangements interrelate and influence one another over time.

Key findings.
– The study outlines that performance measurement system design consists of two interlocked subsystems: a technical subsystem (data aggregation and representation) and a conceptual subsystem (the collection of performance measures).
– Qualitative findings from the single case study suggest that material constraints of legacy, scattered data systems act as primary drivers for initiating technological change and system redesign.
– The implementation of the centralized system created new affordances, specifically multi-layered dashboards, which enabled more informed, role-specific executive decision-making routines.
– Because this is an exploratory, qualitative research-in-progress paper, the findings and design principles have not been statistically tested or systematically validated across multiple cases.
What it means for you
  • CIO / IT Executive: Schedule a 9:00 AM meeting with your lead data architect to audit your current legacy and scattered data systems, identifying the top two data silos to target for integration into a centralized dashboard platform.
  • IT Manager: Draft a visual mockup of a multi-layered dashboard and sit down with a department head to map their conceptual performance metrics directly to your existing technical data feeds.
  • Business Strategist: List the top three strategic decisions you must make next month, and define the specific, role-based performance measures (KPIs) you need visualized to make those decisions.
  • Researcher: Draft a survey instrument or a multi-case study protocol designed to quantitatively test and validate how the interlocking of technical and conceptual subsystems impacts decision-making across 50+ mid-sized companies.
  • Policymaker: Update your organization's IT governance policy to mandate that any new performance tracking software proposal must explicitly demonstrate how its technical data aggregation layer aligns with specific user roles.
Performance Measurement Systems, Imbrication Theory, Decision Support, IT Service Management, Dashboard Design, Socio-material Systems
A Theory-Driven LLM Agent Design for Generating Synthetic Data in Tele-Triage
DESRIST 2026 · Part III

Hetiao Slim Xie, Morteza Namvar, Saeed Akhlaghpour and Andrew Staib
🎤 Lev Kenning & Lore Koestler
This research-in-progress study proposes a theory-driven large language model (LLM) agent framework to generate synthetic patient self-report data from clinician-recorded medical notes. The design integrates three behavioral theories to structure the LLM's workflow, prompt architecture, and a heuristic algorithmic module. The synthetic data's utility is evaluated by using it to fine-tune a ClinicalBERT model for downstream tele-triage predictive tasks.

The problem. Developing artificial intelligence for tele-triage is severely hindered by a scarcity of patient self-report data due to strict privacy regulations and high clinical expert costs. Existing efforts to generate synthetic text lack a robust theoretical foundation, relying instead on fragile prompt engineering that fails to capture the underlying psychological and situational mechanisms of patient reporting. This data mismatch leads to poorly performing, unreliable AI models that cannot be safely deployed in clinical settings.

Key findings.
– The proposed theory-driven multi-agent framework successfully transforms structured clinical notes into naturalistic, synthetic patient self-reports (Supported).
– Downstream predictive models fine-tuned on the generated synthetic data outperformed models trained on the original clinical notes, reaching 0.715 accuracy and 0.729 F1-score versus 0.632 accuracy and 0.648 F1 (Supported).
– This study is a research-in-progress, and the preliminary findings are currently limited by a lack of extensive clinical validation and comparative evaluations with real patient-reported data (Acknowledged limitation).
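When quoting the reported improvement, it helps to separate the absolute gain (in percentage points) from the relative gain (in percent of the baseline). The short calculation below uses only the four figures reported in the findings above; the helper name `gains` is an illustrative choice.

```python
# Figures as reported in the findings above.
baseline = {"accuracy": 0.632, "f1": 0.648}   # trained on original clinical notes
synthetic = {"accuracy": 0.715, "f1": 0.729}  # fine-tuned on synthetic self-reports


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_pp = (after - before) * 100
    relative_pct = (after - before) / before * 100
    return round(absolute_pp, 1), round(relative_pct, 1)


for metric in baseline:
    pp, pct = gains(baseline[metric], synthetic[metric])
    print(f"{metric}: +{pp} percentage points, +{pct}% relative")
```

Accuracy rises by 8.3 percentage points (about 13.1% relative to the baseline), and F1 rises by 8.1 percentage points (12.5% relative).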
What it means for you
  • CIO / IT Executive: Authorize a formal security and compliance audit to establish a secure, HIPAA-compliant sandbox environment where data science teams can safely pilot LLM-based translation of de-identified clinical notes into synthetic patient narratives.
  • IT Manager: Set up a new repository and draft the system architecture for a multi-agent workflow using LangChain, defining specific system prompts that incorporate behavioral theory parameters to convert clinician-recorded medical notes into patient-style self-reports.
  • Business Strategist: Draft a business case proposal for the executive board illustrating how synthetic data generation can bypass the costly 6-to-12 month patient data acquisition bottleneck, citing the study's preliminary accuracy gain from 0.632 to 0.715 (about 8 percentage points) to justify the ROI of an LLM-based triage training pilot.
  • Researcher: Draft an IRB protocol to run a double-blind validation study where clinical experts rate the naturalness, clinical plausibility, and behavioral alignment of 100 LLM-generated patient self-reports against 100 actual patient-reported intake records.
  • Policymaker: Convene a committee of clinical safety and data governance officers to draft organizational standards that mandate rigorous clinical utility testing and strict risk limits before any AI models trained on synthetic patient data are deployed in live clinical triage environments.
Tele-triage, synthetic data, design science, LLM agents, healthcare AI, behavioral simulation
Redesigning the Master’s Thesis for Epistemic Adequacy Using Design Science Research in a Generative AI (LLM)-Supported Context
DESRIST 2026 · Part III

Sylvana Kroop
🎤 Lev Kenning & Lore Koestler
This study reports on the ongoing design and initial testing of a redesigned master's thesis format based on Hevner et al.'s Design Science Research (DSR) framework. The proposed model shifts the assessment focus from text-heavy volume to a four-stage, milestone-based process spanning approximately 30 pages to combat the challenges of generative AI.

The problem. The widespread availability of large language models makes fluent academic writing a commodity, breaking the link between polished prose and individual student competence. Traditional, text-heavy master's thesis formats are highly vulnerable to AI exploitation, creating a need for assessment models that make cognitive effort and decision-making traceable.

Key findings.
– The findings are qualitative, context-bound, and not statistically generalizable, as they are based on an interpretive analysis of a pilot with 39 students at a single institution.
– Written text alone was found to be insufficient as evidence of competence; active discussion and defense of design decisions were necessary to verify student understanding.
– Traceability (explicitly linking problem insights to requirements and design choices) emerged as the strongest mechanism for establishing epistemic transparency.
– The staged, four-part milestone structure generated productive progress, with all 39 students in the cohort successfully continuing through the first three stages without dropping out.
What it means for you
  • CIO / IT Executive: On Monday morning, direct your educational technology team to design an integration between your Learning Management System (LMS) and a version-control repository (like GitHub Classroom) to automatically log timestamped commits of student project milestones, ensuring their intellectual progress is traceable.
  • IT Manager: On Monday morning, update your project delivery templates in Jira or Confluence to require a mandatory 'Decision Traceability Matrix' linking every system requirement directly to its architectural design choice, preventing team members from using LLMs to bypass documenting their actual reasoning.
  • Business Strategist: On Monday morning, replace the written proposal phase of your internal innovation program with a four-stage milestone pipeline that culminates in a live, 15-minute 'decision defense' session where teams must orally justify their strategic pivots.
  • Researcher: On Monday morning, draft a comparative pilot study protocol to test this 30-page, four-stage Design Science Research (DSR) thesis model in your own department, establishing oral defense performance as the primary metric to evaluate actual student competence.
  • Policymaker: On Monday morning, draft a policy amendment for the academic senate requiring all graduate degree regulations to mandate oral defenses and multi-stage milestone tracking, officially declaring that written text alone is no longer sufficient evidence of academic competence.
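The traceability idea in the findings (linking problem insights to requirements and design choices) can be sketched as a small data structure with a completeness check. This is a minimal illustration only: the `Requirement` class, its field names, and the sample entries are hypothetical and are not taken from the thesis model.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    text: str
    insight_ids: list[str] = field(default_factory=list)   # problem insights that motivate it
    decision_ids: list[str] = field(default_factory=list)  # design choices that realize it


def untraced(requirements: list[Requirement]) -> list[str]:
    """Return IDs of requirements lacking an upstream insight or a downstream decision."""
    return [r.req_id for r in requirements if not r.insight_ids or not r.decision_ids]


# Hypothetical sample entries.
reqs = [
    Requirement("R1", "Show progress per milestone", ["I1"], ["D1", "D2"]),
    Requirement("R2", "Export results as PDF", ["I2"], []),  # no design decision yet
    Requirement("R3", "Support offline use", [], ["D3"]),    # no problem insight
]
print(untraced(reqs))  # ['R2', 'R3']
```

A check like this only shows that links exist; whether a link is well reasoned is what the oral defense is meant to establish.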
Design Science Research, master's thesis redesign, generative AI, epistemic adequacy, assessment design
From Collaboration to Toolkit: Applying eDSR in a Generative AI Knowledge Management Project
DESRIST 2026 · Part III

Dmitry Kudryavtsev, Umair Khan, Janne Kauttonen and Jukka Remes
🎤 Lev Kenning & Lore Koestler
This study examines how echeloned Design Science Research (eDSR) can be applied to plan, organize, and coordinate a multi-year, industry-embedded research program. By using the Generative AI-enhanced Knowledge Management (GAIK) project as a use case, the authors integrate eDSR with the University-Industry Linkages (UIL) framework to guide collaborative artefact creation. The methodology combines design science cycles with phased relationship development to systematically build a business-oriented GenAI toolkit.

The problem. Industry-university collaborative innovation projects are difficult to manage due to open-ended problem spaces, multiple stakeholders, and high technical uncertainty. Companies often seek immediate, practical tools while universities target generalized, reusable research findings, creating a tension that linear project management models cannot adequately resolve. Additionally, managing the complexity of diverse intermediate artefacts makes coordination and value articulation challenging.

Key findings.
– The authors developed a synchronization mechanism that integrates the eDSR methodology with the UIL framework to co-structure design knowledge creation and collaborative relationships.
– The framework's utility was demonstrated across four iterations of the GAIK project, resulting in the development and initial launch of a modular, six-layer Generative AI toolkit for small and medium-sized enterprises.
– Statistical hypothesis testing was not applicable to this study, as the research used a qualitative Design Science Research (DSR) case study approach to evaluate framework applicability rather than quantitative empirical testing.
What it means for you
  • CIO / IT Executive: On Monday morning, restructure your enterprise GenAI roadmap into a modular, multi-layer architecture (such as separating infrastructure, data orchestration, model training, and user application layers) to ensure that current pilot projects can be scaled as reusable tools rather than remaining as isolated, one-off solutions.
  • IT Manager: On Monday morning, update your project tracking board to categorize deliverables as 'intermediate design artefacts' (such as prompt templates, vector database schemas, and integration APIs) and mandate a weekly synchronization meeting to ensure developers and business analysts are aligned on how these components fit into the broader GenAI toolkit.
  • Business Strategist: On Monday morning, draft a collaborative innovation charter for your university R&D partnerships that explicitly outlines how academic research outputs will be incrementally converted into functional, modular tools for small and medium-sized business units, balancing long-term strategic research with immediate business utility.
  • Researcher: On Monday morning, map your theoretical research questions directly to the echeloned Design Science Research (eDSR) cycles, assigning a specific, testable software prototype or framework component to each research phase so that industry partners can evaluate your findings in real-time.
  • Policymaker: On Monday morning, initiate a review of tech-transfer grant criteria to include structured, iterative methodologies like eDSR as a requirement, ensuring funded university-industry partnerships demonstrate how they will co-create practical, modular AI toolkits that directly benefit regional SMEs.
design science research, university-industry linkages, generative AI, knowledge management, eDSR, GAIK project
LLM Buddy: An AI-Augmented Research Environment for Auditable Design Science
DESRIST 2026 · Part III

Anthony Vigil and Matthew Mullarkey
🎤 Liv Knowles & Leo Kant
This study details the development and evaluation of LLM Buddy, an AI-augmented research environment designed to capture and trace LLM prompts and responses. The prototype environment was evaluated through a longitudinal field deployment spanning a six-iteration Elaborated Action Design Research (eADR) project.

The problem. Conventional version control tools track code changes but fail to record the prompts and decision-making context of AI-assisted development. This gap creates a traceability barrier, threatening the reproducibility, auditability, and rigor of AI-augmented design science research.

Key findings.
– LLM Buddy successfully captured and archived 1,555 distinct AI prompts across multiple language models during a six-iteration field deployment, demonstrating feasibility (supported empirically).
– The tool's automated snapshot system successfully enabled full recovery from an AI-induced file corruption incident, saving estimated development time (supported empirically).
– Analysis of the captured prompt histories supported the identification of a recurring non-linear interaction pattern termed 'Conversational Forking' (supported qualitatively).
– Acknowledged limitation: The evaluation was confined to a single-investigator eADR project, meaning broad cross-context generalizability remains to be verified in future work.
What it means for you
  • CIO / IT Executive: On Monday morning, issue a directive to your Enterprise Architecture team to evaluate and integrate automated prompt-logging and file-snapshotting capabilities into your internal developer platform (IDP) to establish code auditability and mitigate the risks of AI-induced codebase corruption.
  • IT Manager: On Monday morning, set up a local Git-commit hook or automated backup script in your team's repository that takes a quick snapshot of the codebase immediately before a developer runs any AI code-generation prompts, ensuring instant recovery if the LLM corrupts working files.
  • Business Strategist: On Monday morning, work with your AI enablement team to redesign prompt engineering training around 'Conversational Forking' patterns, guiding product teams to deliberately branch their LLM chats into parallel sessions to rapidly test and compare different product business models.
  • Researcher: On Monday morning, establish a dedicated local markdown log or database to systematically capture and timestamp every single prompt, response, and model configuration used in your current research iteration to ensure your methodology is fully auditable and reproducible.
  • Policymaker: On Monday morning, draft an update to your institution's AI ethics and compliance guidelines, mandating that any research or product development utilizing LLMs must maintain an immutable, traceable log of prompt histories to qualify for regulatory clearance or funding.
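The snapshot-before-prompt and prompt-logging actions suggested above can be sketched with the Python standard library. This is an assumption-laden illustration, not how LLM Buddy itself works: the function names, the JSON Lines log format, and the model label `example-model` are all hypothetical.

```python
import json
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def snapshot(workdir: Path, backup_root: Path) -> Path:
    """Copy the working directory to a timestamped folder before an AI-assisted edit."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = backup_root / f"snapshot-{stamp}"
    shutil.copytree(workdir, target)  # backup_root must lie outside workdir
    return target


def log_prompt(log_file: Path, model: str, prompt: str, response: str) -> dict:
    """Append one prompt/response pair to a JSON Lines log and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    work = root / "project"
    work.mkdir()
    (work / "notes.md").write_text("draft", encoding="utf-8")

    saved = snapshot(work, root / "backups")
    log_prompt(root / "prompts.jsonl", "example-model", "Refactor notes.md", "(model reply)")

    restored_text = (saved / "notes.md").read_text(encoding="utf-8")
    logged = (root / "prompts.jsonl").read_text(encoding="utf-8").splitlines()
```

An append-only text log is easy to diff and archive, but it is not tamper-proof; an audit setting that needs immutability would require additional controls.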
Design Science Research, Promptware Engineering, Large Language Models, Prompt Management, Research Transparency, Traceability
Designing a Human-In-The-Loop Clustering Information System for Automotive Field Observations
DESRIST 2026 · Part III

Lukas-Orlando Ulmer, Nicole Maria Schempp and Miriam Gräf
🎤 Liv Knowles & Leo Kant
This study designs, instantiates, and evaluates a Human-in-the-Loop Information System (HITL IS) to support field analysts in automotive quality control. Grounded in Cognitive Fit Theory, the researchers developed four design principles and instantiated them in a web-based prototype. The prototype was evaluated through a survey-based study with 20 practitioners to measure cognitive workload and usefulness.

The problem. Automotive aftersales organizations face rising vehicle complexity and high volumes of heterogeneous repair data, making failure classification cognitively demanding. Automated machine learning models alone cannot handle ambiguous data and contextual nuances, yet current tools fail to support interactive collaboration between human experts and models, leading to high manual effort and inconsistent failure detection.

Key findings.
– The prototype demonstrated a statistically significant reduction in perceived mental demand (p = 0.009), time pressure (p = 0.004), and frustration (p < 0.001) compared to a neutral baseline midpoint.
– Hypotheses that the prototype significantly altered perceived effort (p = 0.132) and perceived success (p = 0.336) compared to the neutral midpoint were not statistically supported.
– The study is limited by a small sample size of 20 practitioners and reliance on a video-based prototype demonstration rather than a longitudinal field evaluation in a live environment.
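Comparing ratings against a neutral scale midpoint can be illustrated with an exact two-sided sign test. The summary does not say which test the authors used, so this is a swapped-in technique for illustration; the ratings, the 7-point scale, and the midpoint of 4 are hypothetical and are not the study's data.

```python
from math import comb


def sign_test_p(ratings: list[float], midpoint: float) -> float:
    """Two-sided exact sign test of ratings against a fixed midpoint."""
    below = sum(r < midpoint for r in ratings)
    above = sum(r > midpoint for r in ratings)
    n = below + above  # ratings equal to the midpoint are dropped
    if n == 0:
        return 1.0
    k = min(below, above)
    # Probability of a split at least this lopsided under a fair coin, doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical 7-point workload ratings from 20 respondents.
ratings = [2, 3, 3, 2, 4, 1, 3, 5, 2, 3, 2, 4, 3, 2, 1, 3, 6, 2, 3, 2]
p = sign_test_p(ratings, midpoint=4)
print(f"p = {p:.4f}")
```

In this made-up sample, 16 ratings fall below the midpoint, 2 above, and 2 are ties, giving p of about 0.0013. A midpoint comparison shows only that ratings differ from neutral, not that the tool outperforms an existing one.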
What it means for you
  • CIO / IT Executive: On Monday morning, initiate a pilot project to integrate Human-in-the-Loop (HITL) interface principles into the current automotive aftersales software stack, shifting resources away from 'black-box' fully automated ML models and focusing instead on interactive tools that allow analysts to manually adjust data clustering.
  • IT Manager: On Monday morning, schedule a 90-minute workshop with your lead UX designer and lead field analysts to map out the current failure classification workflow, specifically identifying the top three ambiguous repair data categories where automated categorization fails and causes high analyst frustration.
  • Business Strategist: On Monday morning, update the business case for your department's digital transformation roadmap to include 'analyst cognitive fatigue' and 'turnover risk' as key metrics, justifying investments in HITL tools based on the reported reductions in perceived frustration and mental demand rather than expecting immediate headcount reductions.
  • Researcher: On Monday morning, draft a research proposal for a 6-month longitudinal field study with a partner automotive dealership network to test the HITL prototype in a live environment, aiming for a sample size of at least 50 active practitioners to address the statistical limitations of the initial video-based evaluation.
  • Policymaker: On Monday morning, draft a corporate data-governance standard requiring a mandatory human-in-the-loop (HITL) verification step for any machine learning systems used to classify high-risk or safety-critical automotive failure modes, preventing total reliance on automated models.
Human-in-the-Loop, Interactive Clustering, Design Science Research, Human-AI Collaboration, Cognitive Fit Theory
Towards Agentic Lecture Production with Human-AI Workflows
DESRIST 2026 · Part III

Johannes Sahlin, Stefan Cronholm, Björn Dahlstrand and Håkan Sundell
🎤 Lev Kenning & Lore Koestler
This study designs and formatively evaluates a prototype system for agentic lecture production that orchestrates human-AI workflows. The system integrates markdown-based content authoring, automated slide generation, text-to-speech narration, and avatar video rendering into a single coordinated pipeline.

The problem. Educators experience high coordination overhead and tool fragmentation because multimedia lecture elements (slides, audio, video) are produced across disconnected platforms. This leads to version drift, duplicated effort, and a lack of traceability between original source content and final video outputs.

Key findings.
– The formative evaluation was qualitative and context-bound, meaning no statistical hypothesis testing was performed.
– Qualitative feedback from six university teachers indicated that the system successfully improved workflow coherence, traceability, and production speed.
– Operational feasibility was demonstrated during deployment, where five video lectures based on distinct textbook chapters were produced within a single working day.
– Users expressed support for the system's ability to maintain human oversight but noted a desire for more granular control over AI actions, such as specifying the exact number of slides generated.
– The study's findings are limited by a small participant sample size (n=6) and the lack of longitudinal data on long-term adoption.
What it means for you
  • CIO / IT Executive: On Monday morning, initiate a pilot project to integrate your institution's disconnected slide, audio, and video creation tools into a single markdown-based automated workflow to reduce coordination overhead for faculty.
  • IT Manager: On Monday morning, update the configuration of your AI media tools to add granular user controls, specifically allowing creators to manually input the exact number of slides and narration segments to generate before running the AI pipeline.
  • Business Strategist: On Monday morning, build a commercial viability model for rapid course deployment based on the benchmark of producing 5 video lectures in a single day, targeting departments with high course-development backlogs.
  • Researcher: On Monday morning, write a study protocol to evaluate the long-term adoption of agentic content tools, designing a longitudinal experiment with a larger sample size (n > 30) to track user retention and workflow speed over a full semester.
  • Policymaker: On Monday morning, draft an academic governance directive that requires a mandatory human oversight step in your curriculum development workflow, ensuring no AI-generated lecture video or avatar is published without final faculty validation.
Agentic Workflows, Lecture Production, Human-AI Collaboration, Educational Technology, Design Science Research
Selbstlernen.app 2.0, A Feedback-Oriented Mobile Application for Supporting Self-regulated Learning in Higher Education
DESRIST 2026 · Part III

Madeleine Dormeyer and Alexander Herwix
🎤 Lev Kenning & Lore Koestler
This study presents the design, development, and evaluation of a mobile application designed to scaffold self-regulated learning (SRL) in higher education. Grounded in established SRL and feedback models, the standalone prototype guides students through planning, enacting, and reflecting on their learning sessions. The design was evaluated using expert feedback, a one-week user trial, and subsequent interviews.

The problem. University students frequently struggle to plan, monitor, and adapt their academic learning processes independently. While digital learning environments can track learning behaviors, existing tools often address only isolated parts of the learning cycle or present descriptive visual data that students struggle to interpret without guidance.

Key findings.
– Field test participants perceived the application as intuitive and useful for structuring their study sessions.
– Feedback visualisations tracking goals, tasks, and focus time were found to be useful for informing subsequent study habits, whereas mood and focus-prompt visualisations were reported as less actionable.
– While notifications appeared to support regular usage patterns for some participants, they were not found to clearly increase intrinsic motivation to learn.
– Due to the pilot study's limited scale (six participants over a one-week period), findings are descriptive and qualitative rather than statistically generalizable.
What it means for you
  • CIO / IT Executive: Review your institution's digital learning roadmap and halt any plans to procure or develop standalone mood-tracking or push-notification tools for students. Instead, instruct your enterprise architecture team to prioritize integrating simple goal, task, and focus-time tracking widgets directly into the existing Learning Management System (LMS).
  • IT Manager: Open the sprint backlog for your student portal or mobile app development team and deprioritize features related to 'mood logs' and 'motivational push notifications.' Instead, assign a developer to build a clean dashboard widget that visualizes student-defined study goals and historical focus-time data.
  • Business Strategist: Update the value proposition and marketing collateral of your educational software to focus on 'evidence-based self-regulated learning support', specifically highlighting goal-completion visual analytics rather than generic engagement notifications, which the pilot did not find to clearly increase intrinsic motivation.
  • Researcher: Draft a research protocol and institutional review board (IRB) application for a 6-week, mixed-methods study with a cohort of at least 100 undergraduate students to quantitatively test how structured visual feedback on task completion impacts long-term academic performance compared to unstructured study habits.
  • Policymaker: Draft a policy directive for the academic affairs committee to mandate the integration of digital self-regulated learning (SRL) training, specifically instructing students on how to set explicit study goals and track focus time, into the curriculum of all mandatory first-year student orientation seminars.
Self-regulated Learning, Learning Analytics, Feedback, Mobile Application, Higher Education, Design Science Research
Navigating Complexities at Home: A Residential Decision Support System for Flexible Electricity Consumption
DESRIST 2026 · Part III

Lorenzo Matthias Burcheri, Hicham Rahali, Mehdi Testouri, Raphael Frank and Gilbert Fridgen
🎤 Lev Kenning & Lore Koestler
This study designs and evaluates a mobile decision support system that simplifies and personalizes dynamic electricity signals like prices and carbon intensities for households. The prototype was evaluated in an eight-week staggered field experiment tracking high-frequency smart-meter data. The methodology isolates the behavioral effects of single indicators, an aggregated master indicator, and alert notifications.

The problem. While dynamic tariffs and carbon-intensity variations offer pathways to decarbonization, consumers struggle to understand and react to highly volatile and complex utility data. Most households lack accessible, low-cost tools to translate these abstract signals into timely, actionable consumption adjustments. This cognitive barrier prevents active demand response even when users are environmentally or financially motivated.

Key findings.
– The Single Indicators feature produced a statistically significant cumulative increase in consumption of 44.33 kWh per user (p < 0.01), driven primarily by a highly significant hourly increase of 0.072 kWh during low-carbon green periods (p < 0.01).
– Price and Regional Share Single Indicators had no statistically significant effects on energy consumption, suggesting low price salience in this high-income setting.
– The Master Indicator's overall cumulative reduction of 8.5 kWh was not statistically significant.
– Disaggregated analysis of the Master Indicator revealed that urgent, risk-framed red signals led to an hourly consumption reduction of 61 Wh that was significant only at the 10% level (p < 0.1), whereas green signals had no significant effect.
– This exploratory study is limited by its small sample of 23 Luxembourg households, requiring larger-scale studies to confirm generalizability.
What it means for you
  • CIO / IT Executive: Convene a meeting with your lead software architect to re-prioritize the energy app product roadmap, shifting engineering resources away from complex, multi-variable 'master indicators' and instead focusing development on high-visibility 'green-period' scheduling indicators and 'red' risk-framed push notifications.
  • IT Manager: Open Jira and write the technical user stories for the next development sprint to build an automated push-notification service that triggers an urgent, risk-framed warning (e.g., 'Grid carbon intensity is high, please delay heavy appliance use') whenever regional grid emissions exceed a pre-defined threshold.
  • Business Strategist: Rewrite the marketing and value-proposition copy for your high-income customer segments, removing financial savings messaging (which has low salience) and instead highlighting the app's 'low-carbon green period' scheduling feature to appeal to their environmental motivations.
  • Researcher: Draft a research proposal and budget request for a multi-regional randomized controlled trial (RCT) targeting at least 1,000 households to test if the behavioral response to single 'green' indicators and risk-framed 'red' alerts holds true across diverse socio-economic and lower-income demographics.
  • Policymaker: Draft a policy briefing memo for the regional utility commission recommending a new regulatory standard that requires smart-meter mobile applications to display standardized, real-time 'low-carbon green windows' rather than relying solely on dynamic pricing signals to drive grid flexibility.
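The threshold-triggered, risk-framed notification described for the IT Manager can be sketched as a small decision function. The thresholds (400 and 150 gCO2/kWh), the `Alert` class, and the green-window message are hypothetical choices for illustration; the paper's actual indicator logic is not given in this summary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    level: str    # "red" or "green"
    message: str


def carbon_alert(intensity: float, red_above: float, green_below: float) -> Optional[Alert]:
    """Map grid carbon intensity (gCO2/kWh) to an alert, or None in the neutral band."""
    if green_below >= red_above:
        raise ValueError("green_below must be lower than red_above")
    if intensity >= red_above:
        return Alert("red", "Grid carbon intensity is high, please delay heavy appliance use")
    if intensity <= green_below:
        return Alert("green", "Low-carbon window: a good time to run heavy appliances")
    return None


# Hypothetical thresholds; a real deployment would calibrate them to the regional grid.
for reading in (450.0, 250.0, 100.0):
    print(reading, carbon_alert(reading, red_above=400.0, green_below=150.0))
```

Keeping a neutral band between the two thresholds limits how often users are notified, which matters if alert fatigue would blunt the effect of red signals.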
Energy Transition, Demand Response, Residential Decision Support System, Smart Meters, Digital Interventions, Behavioral Economics
Contextualized Interpretability for AI Model Quality Assessment: Designing an LLM-Based Decision Companion for Domain Experts in LCNC Environments
DESRIST 2026 · Part III

Benjamin Gigerl, Claris Chung and Stefan Thalmann
🎤 Lev Kenning & Lore Koestler
This study designs and evaluates AIMQA, an interactive conversational assistant powered by an LLM to support non-technical domain experts in assessing AI model quality within low-code/no-code (LCNC) platforms. Grounded in sensemaking theory, the companion translates technical performance metrics into context-aware, domain-specific explanations.

The problem. Although LCNC platforms democratize AI development, domain experts struggle to evaluate model quality because performance metrics are presented in technical, decontextualized terms. This lack of transparency forces reliance on technical specialists and hinders domain experts from confidently trusting, interpreting, and refining their models.

Key findings.
– A qualitative evaluation with 20 participants supported that linking quantitative metrics to concrete classification outcomes (such as a confusion matrix) was clear and understandable.
– The LLM-based interactive dialogue was qualitatively supported as useful for helping users uncover the causes of specific model behaviors.
– No participants rated the artifact's interpretability support as difficult to understand.
– Limitations of the study include its small sample of 20 participants, its qualitative evaluation in a specific KNIME and Snowflake environment, and user demands for more explicit guidance on model refinement steps.
What it means for you
  • CIO / IT Executive: Initiate a pilot project to integrate conversational LLM 'decision companions' into your organization's active Low-Code/No-Code (LCNC) analytics platforms (such as KNIME or Snowflake) to help non-technical department heads independently evaluate and trust their business models.
  • IT Manager: Configure an LLM-powered API assistant connected to your LCNC platform's model registry that automatically translates raw model performance metrics and confusion matrices into plain-language, interactive explanations for business users.
  • Business Strategist: Identify the specific business units where LCNC AI adoption is bottlenecked by a reliance on data scientists for model validation, and map out their unique domain terminologies to customize future AI-interpretability companions.
  • Researcher: Design an empirical study with a sample size greater than 20 that tests an extended version of the AIMQA framework, specifically evaluating how to generate automated, step-by-step instructions for model refinement based on conversational user feedback.
  • Policymaker: Revise internal AI governance policies to mandate that any LCNC-developed AI models must feature a context-aware, non-technical explanation layer (such as interactive dialogs translating confusion matrices) before being approved for production.
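Translating a confusion matrix into plain language, as described above, can be sketched without any LLM by templating the standard metrics. This is a simplified stand-in, not the AIMQA artifact: the function name, the sentence templates, and the sample counts are hypothetical.

```python
def explain_binary(tp: int, fp: int, fn: int, tn: int,
                   positive: str, negative: str) -> list[str]:
    """Turn binary confusion-matrix counts into plain-language sentences."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return [
        f"Out of {total} cases, the model got {tp + tn} right ({accuracy:.0%}).",
        f"When it flagged a case as '{positive}', it was correct {precision:.0%} "
        f"of the time ({fp} false alarms).",
        f"It caught {recall:.0%} of the real '{positive}' cases and missed {fn}, "
        f"labelling them '{negative}'.",
    ]


# Hypothetical counts for a quality-inspection classifier.
lines = explain_binary(tp=40, fp=10, fn=5, tn=45, positive="defective", negative="ok")
for line in lines:
    print(line)
```

Fixed templates like these give the same wording every time; the study's conversational companion additionally lets users ask follow-up questions in their own domain terms.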
Contextualized Interpretability, AI Model Quality Assessment, Low-Code / No-Code (LCNC), Design Science Research, Sensemaking Theory, Large Language Models
Automated Documentation for Reproducible Research: The Reproducible AI Documentation (RepAID) Tool
DESRIST 2026 · Part III

Armin Haberl and Stefan Thalmann
🎤 Lev Kenning & Lore Koestler
This study designs and evaluates the Reproducible AI Documentation (RepAID) tool, which is integrated into the KNIME Analytics Platform. The tool extracts ML workflow metadata and utilizes Large Language Models (LLMs) to automatically generate standardized reproducibility reports. The researchers evaluated the prototype through comparative lab testing and semi-structured interviews with target users.

The problem. While low-code machine learning tools enable non-technical researchers to implement complex models, these platforms often lack transparency and code-sharing capabilities. This makes it difficult for researchers to satisfy rigorous journal and conference documentation standards manually. The research gap lies in the lack of automated, customizable tools to bridge the mismatch between low-code ease of use and strict reproducibility requirements.

Key findings.
– Lab evaluations qualitatively demonstrated that the RepAID tool successfully generated accurate, structurally consistent documentation that detailed internal workflow processes better than manual records (statistical significance not tested due to qualitative design).
– Qualitative feedback from six researchers supported the tool's ease of use and potential to reduce manual documentation effort, though results are restricted by the small sample size.
– Evaluation revealed limitations including the tool's inability to document external context (like dataset sources or hardware configurations) and a potential risk of user automation bias.
What it means for you
  • CIO / IT Executive: Initiate a pilot of the RepAID tool within your organization's KNIME Analytics Platform and mandate a 'Human-in-the-Loop' sign-off policy for all automated documentation to mitigate the risk of user automation bias before reports are published.
  • IT Manager: Configure a standardized, mandatory intake form in your team's project management tool to capture external metadata (such as dataset sources and hardware configurations) that RepAID cannot automatically extract, ensuring complete documentation packages.
  • Business Strategist: Conduct an audit of the time spent by your data science and research teams on manual documentation, and build a business case for adopting automated tools like RepAID to reallocate those hours toward high-value model development.
  • Researcher: Download and install the RepAID extension in your KNIME workspace on Monday morning, run it on your current active workflow, and manually append a 'Context' section to the generated report detailing your specific dataset origins and hardware specs.
  • Policymaker: Update your organization's or journal's scientific submission guidelines to explicitly accept automated reproducibility reports (like those from RepAID), provided they are accompanied by a signed checklist verifying the accuracy of the LLM-generated content.
Low-Code, Machine Learning, Reproducibility, Documentation, Automation
TerrainGrade: An Artifact for Flood Susceptibility Mapping
DESRIST 2026 · Part III

Thomas Roderick, Monica Tremblay, Rajiv Kohli and Arturo Castellanos
🎤 Liv Knowles & Leo Kant
This study introduces TerrainGrade, a design science artifact that generates cell-level flood susceptibility estimates using open national hydrologic data. The methodology leverages a machine learning pipeline implemented on an H3 hexagonal grid to produce continuous flood probability indexes. The artifact was evaluated through a pilot implementation covering 2.2 million spatial cells in Maryland.

The problem. Traditional flood maps are costly to update and present risk as binary zones, making them difficult to integrate with modern geospatial and machine learning workflows. Furthermore, existing hydrologic models do not directly yield cell-based inundation layers suitable for continuous hazard assessment. This creates a gap for planners and insurers who require scalable, granular, and continuous flood risk screening.

Key findings.
– The machine learning pipeline demonstrated high predictive performance under spatial cross-validation, achieving out-of-fold AUC values of 0.984, 0.978, and 0.969 across three increasing flood severity thresholds (statistically supported).
– External validation against FEMA maps confirmed that designated flood zones received statistically significantly higher susceptibility scores (mean increase of 0.198, Welch t = 131.9, Cohen's d = 0.48), despite FEMA data not being used for training (statistically supported).
– The pilot study is limited to the state of Maryland and exhibits reduced predictive stability in flat coastal terrains, meaning further multi-state data is required to generalize findings across different geographic regions (acknowledged limitation).
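The external-validation statistics above (mean difference, Welch t, Cohen's d) can be illustrated with a minimal sketch. The scores below are made up for illustration and are not data from the paper; the function names `welch_t` and `cohens_d` are hypothetical helpers, and Cohen's d is computed here with the pooled standard deviation, which may differ from the authors' exact choice.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (an assumption)."""
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


# Made-up susceptibility scores, NOT data from the paper.
inside_zone = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59]
outside_zone = [0.41, 0.35, 0.52, 0.30, 0.44, 0.38]

mean_diff = statistics.mean(inside_zone) - statistics.mean(outside_zone)
t_stat = welch_t(inside_zone, outside_zone)
d = cohens_d(inside_zone, outside_zone)
print(f"mean difference = {mean_diff:.3f}, Welch t = {t_stat:.2f}, Cohen's d = {d:.2f}")
```

For equal group sizes n, Welch's t equals d multiplied by the square root of n/2, so t grows with sample size while d does not. That is one way to read a very large t alongside a moderate d when millions of cells are compared.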
What it means for you
  • CIO / IT Executive: On Monday morning, authorize a technical assessment of whether your spatial data infrastructure can support Uber's H3 hexagonal hierarchical spatial index, laying the foundation for your database to transition from legacy binary-zone shapefiles to cell-level, continuous risk-scoring architectures.
  • IT Manager: On Monday morning, download open national hydrologic data for your target territory and configure a Python-based ETL pipeline using the H3 library to partition the geographic region into hexagonal cells, preparing the spatial framework for machine learning model ingestion.
  • Business Strategist: On Monday morning, retrieve your organization's real estate or insurance portfolio data for Maryland and cross-reference assets located outside official FEMA flood zones with their TerrainGrade continuous susceptibility scores to identify and reprice hidden high-risk liabilities.
  • Researcher: On Monday morning, initiate a research project to refine the TerrainGrade ML pipeline by integrating coastal bathymetry and tidal boundary data, specifically targeting a flat coastal region like Florida to resolve the model's documented predictive instability in low-lying coastal terrains.
  • Policymaker: On Monday morning, draft a policy directive requiring state municipal planning departments to pilot continuous, cell-level hazard probability indexes alongside traditional binary FEMA maps when evaluating zoning permits and infrastructure resilience funding.
Flood susceptibility, Design science research, H3 spatial index, Machine learning, National Water Model, CFIM, HAND
PaperMate: A Prototype for Supporting Comprehension of Scientific Texts for Non Native Academic Readers
DESRIST 2026 · Part III

PaperMate: A Prototype for Supporting Comprehension of Scientific Texts for Non Native Academic Readers

Martin Hänel, Kevin Fred Mwaita, Thiemo Wambsganss and Matthias Söllner
🎤 Liv Knowles & Leo Kant
This study introduces PaperMate, a canvas-based reading support system designed to help non-native academic readers comprehend complex scientific texts. Grounded in the Active Reader View, the prototype integrates simplification, structural guidance, key info extraction, and conversational clarification into a unified, user-controlled interface. The system was developed and refined using iterative design science research cycles, including laboratory and field testing.

The problem. Non-native English speakers face significant linguistic, structural, and conceptual barriers when engaging with English scientific literature, leading to comprehension errors and longer reading times. Existing reading assistants typically offer only isolated tools (such as translation or summarization) in fragmented interfaces, which fails to support a deep, holistic understanding of the text. Additionally, educational institutions lack the scalable resources needed to offer personalized reading support to these students.

Key findings.
– The study successfully designs and implements PaperMate, a unified canvas-based prototype that operationalizes five core design principles to facilitate text comprehension.
– Specific statistical hypothesis testing and quantitative performance outcomes from the laboratory evaluations (with 140 and 155 participants) are not reported in this article, as the text focuses primarily on the prototype's architectural design and development process.
– Preliminary proof-of-value evaluations qualitatively supported the system's feasibility and perceived usefulness, though comprehensive field validation in real-world environments is still ongoing.
What it means for you
  • CIO / IT Executive: Initiate an enterprise software audit to identify fragmented translation and summarization extensions currently used by staff, and allocate budget to pilot a unified, canvas-based reading workspace that integrates these tools to reduce cognitive load for international researchers.
  • IT Manager: Build a pilot environment using a canvas tool (such as Obsidian Canvas or a custom HTML canvas) integrated with LLM APIs for side-by-side text simplification and conversational clarification, and recruit five non-native English-speaking researchers to test it with their current literature pipelines.
  • Business Strategist: Review and update the product development roadmap of your EdTech or publishing platform to transition from standalone widgets (like hover-dictionaries) to a holistic 'Active Reader' canvas workspace, targeting the high-growth market of non-native English academic consumers.
  • Researcher: Draft a protocol for a controlled A/B testing experiment to quantitatively measure the differences in reading speed, comprehension errors, and cognitive load between researchers using a unified canvas-based interface and those using fragmented browser extensions.
  • Policymaker: Draft and issue a digital equity directive requiring the university’s procurement office to prioritize learning management systems (LMS) and e-library vendors that offer integrated reading support for non-native speakers as a standard accessibility feature.
Design Science Research, Reading Comprehension, Reading Support System, Non Native Readers, Active Reader View
Developing AI Literacy of Novice Adult Learners Outside of Formal Education Settings, a Prototype
DESRIST 2026 · Part III

Developing AI Literacy of Novice Adult Learners Outside of Formal Education Settings, a Prototype

Alexander Rinkowski and Dennis Kundisch
🎤 Lev Kenning & Lore Koestler
This study designs, implements, and evaluates a theory-grounded, practice-oriented AI literacy course prototype for novice adult learners outside formal educational settings using Action Design Research. Grounded in a digital literacy framework and Dynamic Skills Theory, the multi-session course uses low-tech metaphors and an hourglass instructional design to transition learners toward self-directed AI use.

The problem. The widespread public availability of generative AI has made AI literacy essential, yet nearly half of adults in the EU lack basic digital literacy, leaving them underserved. Existing educational resources either lack theoretical grounding to build accurate mental models or focus strictly on formal educational environments rather than independent adult learners.

Key findings.
– In self-reported MAILS data, learners' AI literacy scores consistently improved from ≤2 points to ≥8 points on a 0-10 scale across course sessions (statistically supported).
– Performance tests and chatbot interaction logs confirmed that participants successfully progressed from basic search-like inputs to structured multi-turn conversation and prompting strategies (statistically supported).
– Acknowledged Limitations: The findings are based on a small, self-selected sample of participants (initial cohort of 12 older adults, followed by three similar small-scale iterations), which limits the immediate scalability and generalizability of the prototype.
What it means for you
  • CIO / IT Executive: Allocate budget to pilot a non-technical AI literacy program for administrative and non-IT staff, utilizing the study's 'hourglass instructional design' and low-tech metaphors to transition employees from basic search-like queries to productive multi-turn chatbot interactions.
  • IT Manager: Review current helpdesk and internal AI chatbot usage logs to identify users struggling with basic, search-like inputs, and provide them with a simple template guiding them on how to construct structured, multi-turn prompts.
  • Business Strategist: Develop an upskilling roadmap for frontline and low-digital-literacy business units, using the Dynamic Skills Theory framework to gradually build their AI self-efficacy and drive self-directed AI adoption across legacy departments.
  • Researcher: Draft a grant proposal or research design to replicate this Action Design Research study with a larger, randomized, and demographically diverse cohort to test the scalability of the prototype beyond small-scale groups of older adults.
  • Policymaker: Initiate a partnership between local municipalities and informal community education centers to fund and deploy low-barrier, metaphor-based AI literacy courses, targeting the nearly 50% of adult citizens lacking basic digital skills.
AI literacy, AI competence, AI skills, GenAI, Generative AI, Action Design Research
LitFlow: An Integrated, AI-Augmented Systematic Literature Review Platform
DESRIST 2026 · Part III

LitFlow: An Integrated, AI-Augmented Systematic Literature Review Platform

Hans-Henning Näscher, Timo Strohmann and Jan vom Brocke
🎤 Liv Knowles & Leo Kant
This study presents and evaluates LitFlow, a web-based platform designed to support systematic literature reviews through transparent, researcher-controlled AI augmentation. Developed using design science research, the platform integrates multi-database searching, screening, data extraction, and audit-trail generation in a single workspace. The authors conduct a qualitative, formative evaluation using think-aloud interviews with five researchers to assess usability and workflow completeness.

The problem. Conducting systematic literature reviews is traditionally a highly time-consuming process that depends on fragmented, non-interoperable toolchains. While emerging AI-based tools attempt to streamline this process, they often prioritize automation over methodological transparency, which compromises researcher agency, decision auditability, and rigor.

Key findings.
– The evaluation was purely qualitative and exploratory, based on a limited sample of five researchers, meaning no statistical hypotheses were quantitatively tested.
– Participants consistently reported that the integrated workflow provided high value, specifically the multi-database search feature.
– Participants indicated they felt in full control of decisions, viewing the AI as a restrained assistant rather than an autonomous decision-maker.
– AI support in query refinement was qualitatively perceived as less useful than AI assistance during screening and data extraction.
– Participants raised concerns regarding potential AI anchoring effects, suggesting that viewing AI recommendations first could bias independent researcher judgment.
What it means for you
  • CIO / IT Executive: Review the procurement pipeline for academic and market research software to mandate that any AI-enabled tools must feature transparent, researcher-controlled workflows with built-in audit trails rather than automated, black-box decision-making.
  • IT Manager: Adjust the default UI configuration of your deployed AI-assisted screening and extraction tools to hide AI recommendations and confidence scores until the human researcher has logged an initial independent assessment, mitigating cognitive anchoring bias.
  • Business Strategist: Reallocate development resources for your internal competitive intelligence products away from AI-driven query-refinement features, and instead focus on building a unified workspace that integrates multi-source search and automated data extraction.
  • Researcher: When beginning your next literature review, draft your search queries manually and blind yourself to AI-generated screening suggestions until after you have made your first pass of independent assessments to protect your methodological rigor from anchoring bias.
  • Policymaker: Draft and publish updated institutional guidelines for scientific integrity that require any researchers using AI-augmented review tools to export and submit an immutable, step-by-step audit trail documenting human versus machine decisions.
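The "human first, AI second" recommendation above can be sketched as a small data structure. This is an illustrative sketch only, not LitFlow's implementation: the class `ScreeningItem`, its fields, and the "include"/"exclude" labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreeningItem:
    """One record under review; all field names here are hypothetical."""
    title: str
    ai_recommendation: str                # e.g. "include" or "exclude"
    human_decision: Optional[str] = None  # stays None until the reviewer decides

    def record_human_decision(self, decision: str) -> None:
        if decision not in ("include", "exclude"):
            raise ValueError("decision must be 'include' or 'exclude'")
        self.human_decision = decision

    def visible_ai_recommendation(self) -> Optional[str]:
        """Return the AI suggestion only after the reviewer has decided."""
        if self.human_decision is None:
            return None
        return self.ai_recommendation

    def audit_row(self) -> dict:
        """Side-by-side record of the human and machine decisions."""
        return {
            "title": self.title,
            "human": self.human_decision,
            "ai": self.ai_recommendation,
            "agree": self.human_decision == self.ai_recommendation,
        }


item = ScreeningItem(title="Example paper", ai_recommendation="include")
hidden = item.visible_ai_recommendation()    # nothing logged yet, so hidden
item.record_human_decision("exclude")
revealed = item.visible_ai_recommendation()  # now visible to the reviewer
row = item.audit_row()
```

Keeping the reveal logic next to the audit row means the same object that enforces blinding also produces the human-versus-machine record that an audit trail would need.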
Systematic Literature Review, AI Augmentation, Design Science Research, Large Language Models, Workflow Integration, Human-AI Collaboration
Bridging the Gap: A Hybrid Intelligence Decision Support System for B2B Pricing
DESRIST 2026 · Part III

Bridging the Gap: A Hybrid Intelligence Decision Support System for B2B Pricing

Josef Valentin, Tobias Hornbogen, Thomas Haskamp and Jan vom Brocke
🎤 Liv Knowles & Leo Kant
This study presents the design and evaluation of a web-based, hybrid intelligence Decision Support System (DSS) developed in collaboration with a large German wholesaler to assist in B2B pricing. The system utilizes K-Nearest Neighbors as a Case-Based Reasoning mechanism to provide frontline sales agents with transparent, transaction-specific price corridors. The artifact was evaluated through an offline simulation of 160,000 transactions and field-validated with experienced sales agents.
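To make the idea of a K-Nearest Neighbors price corridor concrete, here is a minimal sketch under stated assumptions. The features (order quantity and list price), the relative-distance measure, the corridor definition (minimum, median, and maximum of the neighbours' realized prices), and the sample data are all hypothetical; the summary does not specify how the authors' system defines them.

```python
import math
import statistics

# Hypothetical historical deals: (order_quantity, list_price, realized_unit_price)
HISTORY = [
    (10, 100.0, 92.0),
    (12, 100.0, 90.0),
    (50, 100.0, 81.0),
    (55, 100.0, 80.0),
    (11, 120.0, 111.0),
    (60, 120.0, 95.0),
    (9, 100.0, 94.0),
]


def price_corridor(quantity, list_price, k=3):
    """Return (low, median, high) and the k most similar past deals."""

    def distance(case):
        q, lp, _ = case
        # Scale each feature by the query value so neither one dominates.
        return math.hypot((q - quantity) / quantity, (lp - list_price) / list_price)

    neighbours = sorted(HISTORY, key=distance)[:k]
    prices = [price for _, _, price in neighbours]
    corridor = (min(prices), statistics.median(prices), max(prices))
    return corridor, neighbours


corridor, neighbours = price_corridor(quantity=10, list_price=100.0)
print("corridor (low, median, high):", corridor)
print("based on past deals:", neighbours)
```

Returning the neighbours alongside the corridor is what makes this kind of recommendation explainable: an agent can see exactly which past transactions the suggested range comes from.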

The problem. B2B pricing is frequently delegated to frontline sales agents who default to excessive discounting under time pressure, causing margin erosion. Standard data-driven automation is hindered by severe data censoring, as historical databases typically do not record rejected quotes, rendering optimal price points unrecoverable. Consequently, companies must find a way to guide agent negotiations without relying on opaque black-box optimization or rigid, non-adaptive static floor-and-target rules.

Key findings.
– An offline simulation of 160,000 recent transactions demonstrated estimated revenue uplifts of 8.2% to 12.0% across simulated negotiation scenarios (statistically supported in simulation).
– The same offline simulation demonstrated estimated profit uplifts of 26.8% to 38.9% across simulated negotiation scenarios (statistically supported in simulation).
– Field validation with experienced sales agents confirmed that the generated pricing corridors were plausible, realistic, and aligned with practical negotiation workflows (supported).
– A limitation is that the evaluation relies on offline simulation within a single German finishing trades wholesaler; outcomes from live transactions have yet to be validated in the field.
What it means for you
  • CIO / IT Executive: Meet with your enterprise architecture and ERP integration teams to initiate a technical feasibility assessment for embedding a K-Nearest Neighbors (KNN) pricing recommendation engine directly into the frontline sales portal, prioritizing transparent pricing corridors over opaque black-box AI.
  • IT Manager: Direct your database administrators to modify the CRM and quoting system schemas to immediately begin capturing and logging 'rejected' quotes, not just accepted transactions, to eliminate data censoring for future machine learning models.
  • Business Strategist: Draft a business case for a pilot program targeting a subset of sales agents, replacing rigid static discount floors with dynamic, transaction-specific price corridors to capture a portion of the simulated 26.8% to 38.9% profit uplift.
  • Researcher: Write a research proposal to partner with a B2B firm for a randomized controlled field trial that moves the K-Nearest Neighbors pricing DSS from offline simulation to validation in a live transactional environment.
  • Policymaker: Draft an internal corporate pricing governance policy that mandates 'algorithmic explainability' for all sales-enablement tools, ensuring any AI-driven pricing recommendation can be justified to agents using historical peer transactions.
Dynamic Pricing, Decision Support System, Censored Data, Case-Based Reasoning, B2B Pricing, Hybrid Intelligence
Prototyping VR Training Using Design Thinking and ADR
DESRIST 2026 · Part III

Prototyping VR Training Using Design Thinking and ADR

Stefan Nilsson, Daniel Sjölie, Ulf Andersson, Zakarias Mortensen, Darmin Poturovic, Jesse Katende and Amir Haj-Bolouri
🎤 Lev Kenning & Lore Koestler
This study explores the collaborative design and development of a Virtual Reality (VR) de-escalation training system for onboard train managers. By integrating Design Thinking within an Action Design Research framework, the researchers developed and evaluated low-cost Minimum Viable Products to capture complex real-world practices before committing to full VR development.

The problem. Onboard train staff face a high risk of passenger aggression within restricted and moving train carriages, but authentic practice opportunities are rare. Traditional classroom training fails to simulate the psychological stress and physical constraints of these encounters, while developing high-fidelity VR scenarios is typically cost-prohibitive and technically rigid.

Key findings.
– Qualitative evaluation supported the prototype's high utility and realism, with 21 experienced practitioners confirming its effectiveness as a safe space to experiment with different responses, although formal quantitative statistical testing was not conducted.
– Technical evaluation validated that the modular, JSON-based scenario architecture successfully decoupled content from code, enabling rapid, low-cost iterations without requiring engine re-programming.
– Limitation: The study is limited by its specific context (Sweden's largest train operator) and a small qualitative sample size of 21 participants, who indicated that the system would benefit from a wider variety of interactive choices.
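The decoupling of content from code described above can be sketched as follows. The JSON layout (a `start` node and a `nodes` map with `line` and `choices`) and the dialogue text are assumptions invented for this example; the summary does not describe the project's actual schema.

```python
import json

# Hypothetical scenario file content; the schema is an assumption.
SCENARIO_JSON = """
{
  "start": "greeting",
  "nodes": {
    "greeting": {
      "line": "Passenger: I am not showing you my ticket.",
      "choices": [
        {"text": "Calmly explain the rule", "next": "calmer"},
        {"text": "Raise your voice", "next": "escalated"}
      ]
    },
    "calmer": {"line": "Passenger: Fine, here it is.", "choices": []},
    "escalated": {"line": "Passenger stands up and shouts.", "choices": []}
  }
}
"""


def load_scenario(text):
    """Parse a scenario and check that every choice points at an existing node."""
    scenario = json.loads(text)
    nodes = scenario["nodes"]
    if scenario["start"] not in nodes:
        raise ValueError("start node is missing")
    for name, node in nodes.items():
        for choice in node["choices"]:
            if choice["next"] not in nodes:
                raise ValueError(f"node {name!r} links to unknown node {choice['next']!r}")
    return scenario


def play(scenario, picks):
    """Walk the scenario with a list of choice indexes; return the lines shown."""
    node = scenario["nodes"][scenario["start"]]
    lines = [node["line"]]
    for pick in picks:
        node = scenario["nodes"][node["choices"][pick]["next"]]
        lines.append(node["line"])
    return lines


scenario = load_scenario(SCENARIO_JSON)
transcript = play(scenario, [0])
print("\n".join(transcript))
```

Because the dialogue lives entirely in the JSON text, adding a branch means editing data rather than the `play` function, which is the property the finding attributes to the modular architecture.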
What it means for you
  • CIO / IT Executive: Draft an IT procurement and development directive mandating that all future virtual reality (VR) or immersive training platform acquisitions use a decoupled, JSON-based scenario architecture, so that non-developer content creators can update training content without costly re-programming by software engineers.
  • IT Manager: Schedule a Monday morning sprint-planning task to set up a Git repository and a JSON schema validator for your VR development team, initiating the decoupling of training dialogue scripts from the core game engine to enable rapid, low-cost scenario iterations.
  • Business Strategist: Identify the top three high-risk passenger conflict scenarios for onboard staff and allocate a small budget to develop a low-cost VR Minimum Viable Product (MVP) focusing specifically on these scenarios, using the feedback from a pilot group of 20+ experienced staff to map out highly interactive decision branches.
  • Researcher: Write a research proposal to conduct a quantitative, randomized controlled trial comparing the learning retention and stress responses of transit staff using this modular VR training versus traditional classroom training, addressing the qualitative-only limitation of the current study.
  • Policymaker: Draft an amendment to transit worker safety guidelines recommending that public transportation operators supplement mandatory physical safety training with low-cost, immersive virtual reality de-escalation simulations to safely prepare staff for high-stress carriage environments.
Design thinking, VR training, de-escalation, ADR, Action Design Research, Virtual Reality