
Artificial intelligence is rapidly becoming part of everyday software development. Developers ask large language models to explain SQL queries, generate unit tests, review pull requests, optimize cloud architectures, or document existing code. Most of these interactions happen through chat interfaces, making them useful for individual productivity but difficult to integrate into engineering workflows.
At Data S2, we started from two questions. What if AI capabilities were treated as infrastructure rather than as conversations? And how do you build a low-cost, data-driven agent?
Instead of building another chatbot, we wanted to investigate whether AI could become a programmable service that behaves like any other component in a modern software architecture: observable, testable, reusable, and extensible.
The result of this research is Samy, an open-source AI-first assistant designed for data and engineering teams. Although Samy is still under active development, its architecture represents one of our first public experiments in treating artificial intelligence as an engineering platform instead of a user interface.
AI as an Engineering Primitive
Modern software systems expose capabilities through APIs. Databases expose query APIs. Cloud providers expose infrastructure APIs. Observability platforms expose metrics APIs. Why should AI be different?
Rather than embedding prompts inside applications, Samy exposes AI capabilities through a consistent API organized around skills. Each skill represents a specific engineering capability.
Instead of asking a general-purpose assistant to “help with SQL,” an application can request a dedicated SQL explanation skill. Instead of a generic code generation prompt, a service can invoke a Python refactoring skill or a BigQuery optimization skill.
The interaction becomes deterministic at the architectural level, since the caller knows which skill it invokes and what shape of result to expect, while the model's reasoning inside the skill remains flexible. This distinction may appear subtle, but it changes how AI integrates into software systems.
Skills Instead of Prompts
One of the central architectural ideas behind Samy is that prompts should not become application logic. Prompt engineering tends to produce isolated pieces of text scattered across applications. Skills encapsulate this knowledge. Each skill has a clear objective, receives structured inputs, applies optional contextual knowledge, invokes the language model, and produces a predictable output.
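The shape of a skill described above can be sketched in a few lines of Python. This is a minimal illustration, not Samy's actual code: the names `Skill`, `SkillResult`, `ModelFn`, and `echo_model`, as well as the skill name `sql.explain`, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: any callable that takes a prompt and returns text.
ModelFn = Callable[[str], str]


@dataclass(frozen=True)
class SkillResult:
    """Predictable output shape shared by every skill (illustrative)."""
    skill: str
    output: str
    used_context: bool


@dataclass(frozen=True)
class Skill:
    """One engineering capability: objective, input, optional context, model call."""
    name: str
    objective: str

    def run(self, user_input: str, model: ModelFn,
            context: Optional[str] = None) -> SkillResult:
        # The prompt is assembled inside the skill, not in the calling application.
        parts = [f"Objective: {self.objective}"]
        if context:
            parts.append(f"Context: {context}")
        parts.append(f"Input: {user_input}")
        prompt = "\n".join(parts)
        return SkillResult(skill=self.name, output=model(prompt),
                           used_context=bool(context))


def echo_model(prompt: str) -> str:
    """Stand-in for a real language model client (assumption for the demo)."""
    return f"model saw {len(prompt)} characters"


sql_explain = Skill(name="sql.explain", objective="Explain what the SQL query does.")
result = sql_explain.run("SELECT 1", model=echo_model)
```

The calling application only sees a skill name, an input, and a `SkillResult`. The prompt text stays inside the skill, which is the point of keeping prompts out of application logic.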
Today Samy includes skills across multiple engineering domains. For SQL, the platform can explain queries, review them, optimize execution plans, and generate SQL from natural language. For Python, it supports code generation, refactoring, reviews, documentation, FastAPI development, and automated test generation. Go developers can generate tests, review idiomatic code, and analyze concurrency issues involving goroutines and channels.
Beyond programming languages, Samy also includes domain-specific skills for Google Cloud Platform services, database administration, analytics engineering, and business intelligence. This modular organization allows new capabilities to be introduced without changing the public API. Adding a new skill becomes an architectural extension rather than a redesign.
Retrieval Is Context
One lesson has become increasingly clear across our research on Minimum Context Signals: large language models perform better when they receive the right context, not necessarily more context. For this reason, Samy distinguishes between domains that benefit from Retrieval-Augmented Generation (RAG) and those that can operate effectively using only the language model.
Engineering domains such as SQL, BigQuery, database administration, and analytics often require knowledge of platform-specific best practices. These skills automatically retrieve relevant documentation before generating responses. Other domains, such as Python refactoring or Go code review, rely primarily on the reasoning capabilities of the underlying model during this first iteration. This separation reflects an architectural decision rather than a technical limitation. Context should be injected only when it meaningfully improves the quality of the decision.
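One way to express this separation is a per-skill switch that decides whether retrieval runs before the model call. The sketch below is an assumption-laden illustration: the `KNOWLEDGE` store, the `USES_RETRIEVAL` flags, the skill names, and the word-overlap ranking are all hypothetical and stand in for whatever retrieval Samy actually performs.

```python
from typing import Dict, List

# Hypothetical in-memory knowledge base keyed by domain.
KNOWLEDGE: Dict[str, List[str]] = {
    "bigquery": [
        "Partition large tables by date to limit scanned bytes.",
        "Avoid SELECT * on wide tables.",
    ],
}

# Hypothetical per-skill flag: which skills retrieve context before the model call.
USES_RETRIEVAL: Dict[str, bool] = {
    "bigquery.optimize": True,
    "python.refactor": False,
}


def retrieve(domain: str, query: str, limit: int = 2) -> List[str]:
    """Naive keyword retrieval: rank documents by words shared with the query."""
    query_words = set(query.lower().split())
    scored = []
    for doc in KNOWLEDGE.get(domain, []):
        overlap = len(query_words & set(doc.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; documents with no shared words were already dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


def build_context(skill_name: str, query: str) -> List[str]:
    """Return retrieved snippets only for skills configured to use retrieval."""
    if not USES_RETRIEVAL.get(skill_name, False):
        return []
    domain = skill_name.split(".")[0]
    return retrieve(domain, query)
```

With this layout, `build_context("python.refactor", ...)` always returns an empty list, so the model receives no injected context, while a BigQuery skill receives only the snippets that share words with the request.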
Observability as a First-Class Feature
One characteristic often missing from AI applications is observability. Traditional backend services expose metrics, logs, traces, and operational dashboards. AI systems frequently do not.
Samy treats telemetry as part of the platform rather than an optional add-on. Every skill invocation generates structured events containing metadata about the request, estimated token usage, retrieved knowledge, timestamps, and contextual information. These events are persisted and can later be analyzed to answer questions about how the platform is being used:
Which skills are invoked most frequently?
Which engineering domains require additional knowledge?
Which workflows generate the highest computational cost?
By answering these questions, AI becomes measurable in the same way as any other production service.
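As a rough illustration of how such events could answer the questions above, the sketch below records one event per invocation and aggregates them. The `SkillEvent` fields, the function names, and the four-characters-per-token heuristic are assumptions for the example, not Samy's actual schema or estimator.

```python
import time
from collections import Counter
from dataclasses import dataclass, field
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about four characters per token."""
    return max(1, len(text) // 4)


@dataclass
class SkillEvent:
    """Illustrative telemetry record for one skill invocation."""
    skill: str
    estimated_tokens: int
    retrieved_docs: int
    timestamp: float = field(default_factory=time.time)


def record(events: List[SkillEvent], skill: str, prompt: str,
           retrieved_docs: int) -> None:
    """Append one event; a real system would persist it instead."""
    events.append(SkillEvent(skill, estimate_tokens(prompt), retrieved_docs))


def invocations_by_skill(events: List[SkillEvent]) -> Counter:
    """Which skills are invoked most frequently?"""
    return Counter(event.skill for event in events)


def tokens_by_skill(events: List[SkillEvent]) -> Counter:
    """Which skills account for the most estimated tokens?"""
    totals: Counter = Counter()
    for event in events:
        totals[event.skill] += event.estimated_tokens
    return totals


events: List[SkillEvent] = []
record(events, "sql.explain", "SELECT id FROM users", retrieved_docs=1)
record(events, "sql.explain", "SELECT 1", retrieved_docs=1)
record(events, "python.refactor", "def f(): pass", retrieved_docs=0)
```

Here `invocations_by_skill(events).most_common(1)` reports the most frequently used skill, and `tokens_by_skill(events)` gives a first approximation of cost per skill.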
Designed Like a Backend Service
Although Samy interacts with language models, its architecture follows familiar backend engineering principles. The platform is implemented with FastAPI and organized around a centralized skill registry that maps API routes to concrete implementations.
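The registry idea can be sketched without any web framework. In the example below, the route strings, the `register` decorator, and the placeholder handlers are hypothetical. In a FastAPI application, the `dispatch` function would presumably sit behind a route handler, but that wiring is left out so the sketch stays self-contained.

```python
from typing import Callable, Dict

SkillHandler = Callable[[str], str]

# Central registry: route path -> handler (illustrative paths, not Samy's real routes).
_REGISTRY: Dict[str, SkillHandler] = {}


def register(route: str) -> Callable[[SkillHandler], SkillHandler]:
    """Decorator that adds a handler to the registry under a route."""
    def decorator(handler: SkillHandler) -> SkillHandler:
        if route in _REGISTRY:
            raise ValueError(f"route already registered: {route}")
        _REGISTRY[route] = handler
        return handler
    return decorator


@register("/skills/sql/explain")
def explain_sql(text: str) -> str:
    # Placeholder body; a real skill would call the language model here.
    return f"[placeholder explanation of] {text}"


@register("/skills/go/review")
def review_go(text: str) -> str:
    # Placeholder body; a real skill would call the language model here.
    return f"[placeholder review of] {text}"


def dispatch(route: str, text: str) -> str:
    """Single entry point: look up the route and call its handler."""
    try:
        handler = _REGISTRY[route]
    except KeyError:
        raise LookupError(f"unknown skill route: {route}") from None
    return handler(text)
```

Adding a capability then means registering one more handler. The `dispatch` entry point and existing routes stay unchanged.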
Dependency injection keeps skills independent from infrastructure concerns. Testing covers both unit and integration scenarios. Docker and Docker Compose support local execution. GitHub Actions automate continuous integration.
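A small sketch shows why injection makes skills easy to test. The `ModelClient` protocol, the `GoTestSkill` class, and the `FakeClient` test double are hypothetical names chosen for illustration and do not describe Samy's real interfaces.

```python
from typing import List, Protocol


class ModelClient(Protocol):
    """Hypothetical interface a skill depends on instead of a concrete provider."""
    def complete(self, prompt: str) -> str: ...


class GoTestSkill:
    """Illustrative skill that receives its model client from the outside."""
    def __init__(self, client: ModelClient) -> None:
        self._client = client

    def run(self, source: str) -> str:
        return self._client.complete(f"Write Go tests for:\n{source}")


class FakeClient:
    """Test double: records prompts and returns a canned answer."""
    def __init__(self) -> None:
        self.prompts: List[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return "func TestAdd(t *testing.T) {}"


fake = FakeClient()
skill = GoTestSkill(client=fake)
answer = skill.run("func Add(a, b int) int { return a + b }")
```

Because the skill only knows the `complete` method, a unit test can pass in `FakeClient` and inspect `fake.prompts` without network access, while an integration test can pass a real client.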
From an engineering perspective, Samy behaves much more like a backend platform than like a chatbot. This was intentional.
The long-term objective is to allow AI capabilities to be embedded directly into existing developer tools, internal platforms, automation pipelines, and engineering workflows.
Why Open Source?
Data S2 has always viewed software as a vehicle for research rather than only as a commercial product. Open source allows ideas to be inspected, challenged, improved, and extended by the broader engineering community.
By publishing Samy’s architecture publicly, we hope to encourage discussions around AI infrastructure, Retrieval-Augmented Generation, engineering assistants, observability, and programmable AI systems. We believe these topics deserve open experimentation.
Looking Ahead
The current version of Samy is only the beginning. Its architecture was intentionally designed to grow. New programming languages, cloud providers, analytics platforms, and engineering domains can be added through additional skills without changing the underlying API.
Retrieval mechanisms can evolve from keyword search to semantic retrieval and hybrid ranking. Telemetry can progress from heuristic token estimation to precise cost attribution across models, teams, and workflows.
Most importantly, Samy serves as a research platform. It allows us to investigate questions that extend beyond software engineering.
How should AI systems expose capabilities?
How much context is actually necessary to solve engineering problems?
How should AI services be observed, tested, and governed?
These questions connect directly with our broader research agenda around Minimum Context Signals, where the objective is not to maximize information but to identify the minimum contextual signals required for reliable decision-making. Samy is our first practical exploration of these ideas in AI.
We do not see it as another AI assistant. We see it as an experiment in building AI as Infrastructure. And we believe that distinction will become increasingly important as artificial intelligence moves from isolated tools into the core architecture of modern software systems.

