Simplex 0.9.0 "Edge Intelligence" represents a fundamental shift in how we think about AI optimization and deployment. This release introduces three major innovations:
- Self-Learning Annealing — Hyperparameters that optimize themselves through meta-gradients
- Edge Hive — The Cognitive Hive architecture running locally on any device, from smartwatches to desktops
- Actor-Based HTTP Server — A complete HTTP library where everything is an actor, with first-class hive integration
Building on v0.8.0's dual numbers for automatic differentiation, this release completes the foundation for AI systems that not only learn from data but learn how to learn—and can be deployed anywhere.
The Problem with Hyperparameters
Every ML practitioner knows the pain. You've designed your model, prepared your data, written your training loop—and now you face the real challenge: tuning hyperparameters.
- Learning rate: 0.001? 0.0001? With decay? What schedule?
- Temperature for simulated annealing: Start high, cool slowly? How slowly?
- Regularization strength: Too much and you underfit. Too little and you overfit.
- Dropout rate, batch size, warmup steps, momentum...
The traditional approach: grid search, random search, Bayesian optimization. All external to your model. All requiring multiple training runs. All expensive.
What if the optimization process could optimize itself?
Self-Learning Annealing: The Core Innovation
Self-Learning Annealing treats hyperparameters as dual numbers—values that carry both their magnitude and their sensitivity to change. This enables meta-gradients: gradients of the loss with respect to the hyperparameters themselves.
The Key Insight
Traditional simulated annealing uses a fixed cooling schedule: τ(t) = τ₀ × 0.95^t. Self-Learning Annealing replaces this with a learnable function: τ(t) = f_θ(t, loss, grad), where θ is trained by meta-gradients.
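To make the contrast concrete, here is a minimal Rust sketch of a fixed schedule next to a learnable one. It is an illustration only: the one-parameter form τ(t) = τ₀ · exp(−θ·t) and the function names are assumptions chosen for brevity, not the simplex-training API, and unlike f_θ(t, loss, grad) this toy schedule ignores loss and gradient.

```rust
/// Fixed exponential cooling: tau(t) = tau0 * 0.95^t.
pub fn fixed_temperature(tau0: f64, step: u32) -> f64 {
    tau0 * 0.95_f64.powi(step as i32)
}

/// Hypothetical one-parameter learnable schedule: tau(t) = tau0 * exp(-theta * t).
/// Setting theta = -ln(0.95) recovers the fixed schedule above.
pub fn learnable_temperature(tau0: f64, theta: f64, step: u32) -> f64 {
    tau0 * (-theta * step as f64).exp()
}

/// Sensitivity of the temperature to the learnable parameter:
/// d tau / d theta = -t * tau(t). A meta-optimizer would chain this
/// with dLoss/dtau to obtain dLoss/dtheta.
pub fn d_temperature_d_theta(tau0: f64, theta: f64, step: u32) -> f64 {
    -(step as f64) * learnable_temperature(tau0, theta, step)
}
```

Because the fixed schedule is one point in the learnable family, training θ can only match or improve on it for a given objective, at least in this simplified setting.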
How It Works
The system operates on two nested optimization loops:
Inner Loop (Standard Training):
- Uses current temperature τ to guide the search
- High τ = exploration (accept worse solutions)
- Low τ = exploitation (only accept improvements)
- Standard gradient descent on solution quality
Outer Loop (Meta-Learning):
- Computes meta-gradient: ∂Loss/∂τ
- If ∂Loss/∂τ < 0: a higher temperature would lower the loss (the search is likely stuck in a local minimum) → reheat
- If ∂Loss/∂τ > 0: a higher temperature would raise the loss → continue cooling
- Updates temperature schedule parameters via Adam
use simplex_training::{LearnableSchedule, MetaOptimizer, SoftAcceptance};
// Create learnable temperature schedule (mutable: the meta-optimizer updates it)
let mut schedule = LearnableSchedule::new()
    .initial_temp(10.0)
    .min_temp(0.01)
    .hidden_dim(32); // MLP learns the schedule
// Meta-optimizer trains the schedule itself
let mut meta = MetaOptimizer::adam(0.001);
for epoch in 0..epochs {
    // Inner loop: optimize solution with current schedule
    let (solution, trajectory) = anneal(objective, schedule);
    // Outer loop: optimize schedule from trajectory
    let meta_loss = trajectory.compute_meta_loss();
    schedule.backward(meta_loss);
    meta.step(&mut schedule);
}
Soft Acceptance: Making Annealing Differentiable
Traditional Metropolis acceptance is a hard threshold:
// Traditional (non-differentiable)
if delta_E < 0 || random() < exp(-delta_E / T) {
    accept(new_solution);
}
This has zero gradient—you can't backpropagate through a random comparison. Self-Learning Annealing replaces this with a soft, differentiable alternative:
// Soft acceptance (differentiable)
let accept_prob = sigmoid((threshold - delta_E) / temperature);
let weighted_solution = accept_prob * new_solution + (1.0 - accept_prob) * old_solution;
The sigmoid provides a smooth transition, allowing gradients to flow through the acceptance decision.
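As a sketch of why this matters, the Rust snippet below implements the soft acceptance rule for scalar solutions and writes out the derivative of the acceptance probability with respect to temperature by hand. The function names are hypothetical and this is not the `SoftAcceptance` type from simplex-training; it only shows that the sigmoid form has a well-defined, nonzero gradient.

```rust
/// Logistic sigmoid.
pub fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Soft acceptance probability: sigmoid((threshold - delta_e) / temperature).
pub fn accept_prob(delta_e: f64, threshold: f64, temperature: f64) -> f64 {
    sigmoid((threshold - delta_e) / temperature)
}

/// Derivative of the acceptance probability with respect to temperature.
/// With z = (threshold - delta_e) / T, dz/dT = -z / T, so
/// dp/dT = p * (1 - p) * (-z / T).
pub fn d_accept_prob_d_temperature(delta_e: f64, threshold: f64, temperature: f64) -> f64 {
    let z = (threshold - delta_e) / temperature;
    let p = sigmoid(z);
    p * (1.0 - p) * (-z / temperature)
}

/// Blend the old and new (scalar) solutions by the acceptance probability.
pub fn soft_update(old: f64, new: f64, p: f64) -> f64 {
    p * new + (1.0 - p) * old
}
```

For a move that worsens the solution (`delta_e` above `threshold`), the derivative is positive: raising the temperature raises the chance of accepting it, which is the exploration behavior described earlier.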
Performance Results
| Metric | Fixed Schedule | Learned Schedule | Improvement |
|---|---|---|---|
| Final Loss | 1.0 (baseline) | 0.85-0.90 | 10-15% lower |
| Training Steps | 100K | 70-80K | 20-30% fewer |
| Pruning (50% sparsity) | 5% quality loss | 2% quality loss | 60% less degradation |
| 4-bit Quantization | 8% quality loss | 4% quality loss | 50% less degradation |
The learned schedule adapts to problem structure. For easy optimization landscapes, it cools aggressively. For rugged landscapes with many local minima, it maintains higher temperatures longer and reheats when stuck.
Edge Hive: Local-First AI
The second major feature of 0.9.0 brings the Cognitive Hive architecture to edge devices. No cloud. No API calls. Complete privacy.
The Vision
Current AI deployment follows a client-server model: your device sends data to a cloud API, waits for inference, receives results. This has fundamental problems:
- Latency: Network round-trips typically add 100-500ms per request
- Privacy: Your data leaves your device
- Availability: No internet = no AI
- Cost: API calls add up, especially at scale
Edge Hive inverts this model. The entire Cognitive Hive—specialists, shared SLM, memory systems—runs locally on your device.
Device-Adaptive Model Selection
Not all devices are equal. A smartwatch can't run the same model as a desktop workstation. Edge Hive automatically selects the appropriate SLM tier based on device capabilities:
| Device Tier | RAM | SLM Size | Example Devices |
|---|---|---|---|
| Pico | < 512MB | Pico-SLM (50M params) | Smartwatch, IoT sensors |
| Nano | 512MB - 2GB | Nano-SLM (500M params) | Phone, wearables |
| Micro | 2-4GB | Micro-SLM (1B params) | Tablet, older laptops |
| Mini | 4-8GB | Mini-SLM (3B params) | Laptop, desktop |
| Full | 8GB+ | Full-SLM (7B params) | Workstation, server |
use simplex_http::{EdgeHive, DeviceCapability, LocalSpecialist};
// Edge Hive auto-detects device capabilities
let hive = EdgeHive::new()
    .auto_detect_capability() // Reads available RAM, NPU presence
    .with_encryption(true)    // AES-256-GCM for all data
    .build()?;
// Create specialists that run entirely on-device
let analyst = LocalSpecialist::new("analyst")
    .model(hive.slm()) // Uses device-appropriate SLM
    .domain("code analysis")
    .build()?;
// Process locally - no network required
let analysis = analyst.process(code_snippet).await?;
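The tier table in this section can be read as a simple lookup on available RAM. The Rust sketch below shows one way that mapping might look. `DeviceTier`, `select_tier`, and `slm_params_millions` are hypothetical names, the boundary handling (lower bound inclusive) is an assumption since the table does not say which tier owns exactly 2GB, 4GB, or 8GB, and real detection would presumably also weigh NPU presence.

```rust
/// Hypothetical tier enum mirroring the rows of the device tier table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceTier {
    Pico,
    Nano,
    Micro,
    Mini,
    Full,
}

/// Map available RAM (in MB) to a tier.
/// Assumption: each tier's lower bound is inclusive, so 2048 MB is Micro.
pub fn select_tier(ram_mb: u64) -> DeviceTier {
    match ram_mb {
        0..=511 => DeviceTier::Pico,
        512..=2047 => DeviceTier::Nano,
        2048..=4095 => DeviceTier::Micro,
        4096..=8191 => DeviceTier::Mini,
        _ => DeviceTier::Full,
    }
}

/// SLM parameter count for a tier, in millions, as listed in the table.
pub fn slm_params_millions(tier: DeviceTier) -> u32 {
    match tier {
        DeviceTier::Pico => 50,
        DeviceTier::Nano => 500,
        DeviceTier::Micro => 1_000,
        DeviceTier::Mini => 3_000,
        DeviceTier::Full => 7_000,
    }
}
```

For example, a phone reporting 1024 MB of available RAM would land in the Nano tier and load the 500M-parameter model.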
Security by Default
Edge Hive implements defense-in-depth security:
- Encryption at rest: AES-256-GCM for all stored data
- Key derivation: PBKDF2 with device-specific salt
- Transport security: TLS 1.3 for any network sync
- Message authentication: HMAC-SHA256 for integrity
Your specialist's memories, beliefs, and learned parameters never leave the device unless you explicitly sync them.
Specialist Types
Edge Hive supports four specialist deployment modes:
| Type | Behavior | Use Case |
|---|---|---|
| LocalSpecialist | Runs entirely on-device, never syncs | Privacy-critical, offline-only |
| SharedSpecialist | Syncs state with other devices you own | Personal assistant across devices |
| FederatedSpecialist | Participates in federated learning | Collaborative improvement |
| SyncSpecialist | Real-time sync across devices | Multi-device workflows |
Actor-Based HTTP Server (simplex-http)
The third major feature of 0.9.0 is a complete HTTP server library designed around the actor model. Where servers such as Actix or Axum dispatch handlers onto a pool of worker threads, simplex-http treats everything as actors communicating via messages.
Design Philosophy
- No Threads: Concurrency via actors and async/await, not OS threads
- Hive-Native: First-class routing to cognitive specialists
- Message-Based: Request/response flows as actor messages
- Streaming: WebSocket and SSE for real-time specialist communication
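The message-based principle above can be illustrated in plain Rust. The sketch below is not simplex-http: `RequestMsg`, `Reply`, and `HealthActor` are hypothetical types, and standard-library channels stand in for actor mailboxes. It shows the shape of the idea: a request is a message carrying a reply address, and the actor owns its state and touches it only while handling messages.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// A response delivered back to the caller as a message.
#[derive(Debug, PartialEq)]
pub struct Reply {
    pub status: u16,
    pub body: String,
}

/// A request message; `reply_to` is the mailbox the actor answers on.
pub struct RequestMsg {
    pub path: String,
    pub reply_to: Sender<Reply>,
}

/// Hypothetical actor: private state plus a mailbox of incoming requests.
pub struct HealthActor {
    mailbox: Receiver<RequestMsg>,
    handled: u64,
}

impl HealthActor {
    /// Create the actor together with the address used to send it messages.
    pub fn new() -> (Sender<RequestMsg>, HealthActor) {
        let (tx, rx) = channel();
        (tx, HealthActor { mailbox: rx, handled: 0 })
    }

    /// Process every message currently queued; returns how many were handled.
    pub fn drain(&mut self) -> u64 {
        let mut count = 0;
        while let Ok(msg) = self.mailbox.try_recv() {
            self.handled += 1;
            let reply = if msg.path == "/health" {
                Reply { status: 200, body: format!("ok ({} handled)", self.handled) }
            } else {
                Reply { status: 404, body: String::from("not found") }
            };
            // The caller may have gone away; a failed send is not an actor error.
            let _ = msg.reply_to.send(reply);
            count += 1;
        }
        count
    }
}
```

In this sketch the caller drives `drain` directly on a single thread; a real actor runtime would schedule that work itself, and no locks are needed either way because only the actor ever sees `handled`.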
Basic Server
use simplex_http::{HttpServer, Router, Request, Response};
let router = Router::new()
    .get("/health", health_check)
    .post("/api/query", query_handler);
HttpServer::bind("0.0.0.0:8080")
    .router(router)
    .graceful_shutdown(signal::ctrl_c())
    .serve()
    .await?;
Hive Integration
The key innovation: route HTTP requests directly to cognitive specialists. No serialization layer. No adapter code. Just declare the specialist and wire it to an endpoint:
specialist QuerySpecialist {
    model: SLM,
    type Input = QueryRequest;
    type Output = QueryResponse;
    async fn process(&self, input: QueryRequest) -> QueryResponse {
        let response = self.model.complete(&input.query).await;
        QueryResponse { answer: response.text, confidence: response.confidence }
    }
}
// Build hive and route directly to specialists
let hive = Hive::builder()
    .add_specialist(QuerySpecialist::new(SLM::load("query-model")))
    .build()
    .await?;
let router = Router::new()
    .post("/api/query", hive.handler::<QuerySpecialist>())
    .with(Logger::new())
    .with(RateLimiter::new(100, Duration::from_secs(60)));
HttpServer::bind("0.0.0.0:8080")
    .router(router)
    .with_hive(hive)
    .serve()
    .await?;
Actor Middleware
Middleware are actors too, enabling stateful cross-cutting concerns:
actor AuthMiddleware {
    secret_key: String,
    impl Middleware {
        async fn handle(&self, req: Request, next: Next) -> Response {
            match self.verify_token(req.header("Authorization")).await {
                Ok(user) => {
                    let mut req = req;
                    req.set_extension(user);
                    next.run(req).await
                }
                Err(_) => Response::unauthorized(),
            }
        }
    }
}
Real-Time Streaming
WebSocket and Server-Sent Events for streaming responses from specialists:
// WebSocket for bidirectional communication
actor HiveStreamHandler {
    hive: HiveRef,
    impl WebSocketHandler {
        async fn on_message(&mut self, ws: &WebSocket, msg: Message) {
            // The stream is advanced by next(), so it must be mutable
            let mut stream = self.hive.stream::<QuerySpecialist>(msg.text()).await;
            while let Some(chunk) = stream.next().await {
                ws.send(Message::Text(chunk)).await;
            }
        }
    }
}
// SSE for server-push streaming
async fn stream_reasoning(req: Request) -> Response {
    let hive = req.extension::<HiveRef>().unwrap();
    let input: ReasoningRequest = req.body_json().await?;
    Response::sse(|stream| async move {
        let mut reasoning = hive.stream::<ReasoningSpecialist>(input).await;
        while let Some(step) = reasoning.next().await {
            stream.send(SseEvent::new(step).with_event("reasoning_step")).await?;
        }
        Ok(())
    })
}
Built-in Middleware
| Middleware | Purpose |
|---|---|
| Logger | Request/response logging |
| Cors | Cross-Origin Resource Sharing |
| RateLimiter | Rate limiting per IP |
| Timeout | Request timeout enforcement |
| Compression | Gzip/Brotli response compression |
| AuthMiddleware | JWT/Bearer token authentication |
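To give a feel for the kind of state such middleware carries, here is a minimal fixed-window, per-IP rate limiter in plain Rust. It is a sketch under assumptions, not the simplex-http `RateLimiter`: the type name `FixedWindowLimiter` is hypothetical, the fixed-window algorithm is one choice among several (token bucket and sliding window are common alternatives), and the constructor arguments are assumed to mean "at most N requests per window". The current time is passed in explicitly so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Fixed-window rate limiter keyed by client IP.
pub struct FixedWindowLimiter {
    max_requests: u32,
    window: Duration,
    // Per-IP window start and number of requests seen in that window.
    clients: HashMap<String, (Instant, u32)>,
}

impl FixedWindowLimiter {
    pub fn new(max_requests: u32, window: Duration) -> Self {
        FixedWindowLimiter { max_requests, window, clients: HashMap::new() }
    }

    /// Returns true if a request from `ip` arriving at `now` is allowed.
    pub fn allow(&mut self, ip: &str, now: Instant) -> bool {
        let entry = self.clients.entry(ip.to_string()).or_insert((now, 0));
        // Start a fresh window once the previous one has fully elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.max_requests {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

Holding the per-IP map inside a single actor means the counters need no locking, which is the benefit stateful actor middleware is meant to provide.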
The simplex-training Library
Supporting Self-Learning Annealing is a new training library that makes all hyperparameters learnable:
use simplex_training::{
    LearnableLRSchedule,    // Learning rate
    LearnableDistillation,  // Knowledge distillation temperature
    LearnablePruning,       // Sparsity schedules
    LearnableQuantization,  // Precision reduction
    MetaTrainer,            // Orchestrates meta-learning
};
let trainer = MetaTrainer::new()
    .with_learnable_lr()            // LR adapts to loss landscape
    .with_learnable_distillation()  // Distillation temp adapts to teacher
    .with_learnable_pruning()       // Pruning rate adapts to layer sensitivity
    .with_learnable_quantization(); // Bit-width adapts to weight distribution
let result = trainer.meta_train(&model, &data).await?;
Each learnable schedule is parameterized by a small MLP that takes training state (step, loss, gradient norm) and outputs the hyperparameter value. The MLP parameters are optimized by meta-gradients.
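A rough Rust sketch of such a schedule network follows. The struct names, the single tanh hidden layer, and the softplus output (used here to keep the hyperparameter positive) are all assumptions for illustration; the actual architecture inside simplex-training is not specified above.

```rust
/// Training state fed to the schedule network.
pub struct TrainingState {
    pub step: f64,
    pub loss: f64,
    pub grad_norm: f64,
}

/// Hypothetical one-hidden-layer MLP: 3 inputs -> hidden units -> 1 output.
pub struct ScheduleMlp {
    pub w1: Vec<[f64; 3]>, // one weight row per hidden unit
    pub b1: Vec<f64>,      // one bias per hidden unit
    pub w2: Vec<f64>,      // output weights, one per hidden unit
    pub b2: f64,           // output bias
}

/// Smooth positive squashing: ln(1 + e^x).
fn softplus(x: f64) -> f64 {
    x.exp().ln_1p()
}

impl ScheduleMlp {
    /// Map the training state to a positive hyperparameter value.
    pub fn forward(&self, state: &TrainingState) -> f64 {
        let input = [state.step, state.loss, state.grad_norm];
        let mut out = self.b2;
        for ((row, bias), w_out) in self.w1.iter().zip(&self.b1).zip(&self.w2) {
            // Pre-activation of one hidden unit: dot(row, input) + bias.
            let pre: f64 = row.iter().zip(&input).map(|(w, x)| w * x).sum::<f64>() + bias;
            out += w_out * pre.tanh();
        }
        softplus(out)
    }
}
```

In practice the inputs would likely be normalized (for example, step divided by total steps), and the weights in `w1`, `b1`, `w2`, and `b2` are what the meta-gradients would update.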
Comprehensive Test Infrastructure
Version 0.9.0 includes a complete restructure of the test suite: 156 tests across 13 categories with consistent naming conventions.
| Category | Tests | Coverage |
|---|---|---|
| language/ | 40 | Core language features, syntax, semantics |
| types/ | 24 | Type system, generics, pattern matching |
| ai/ | 17 | Cognitive framework, specialists, hives |
| neural/ | 16 | Neural IR, neural gates, differentiability |
| stdlib/ | 16 | Standard library functions |
| toolchain/ | 14 | Compiler, linker, build system |
| integration/ | 7 | End-to-end workflows |
| + 6 more | 22 | Runtime, async, learning, actors, observability, basics |
Test naming follows a consistent convention: unit_ for unit tests, spec_ for specification compliance, integ_ for integration, and e2e_ for end-to-end workflows.
Building on v0.8.0: Dual Numbers Foundation
Self-Learning Annealing is only possible because of v0.8.0's dual numbers—native forward-mode automatic differentiation with zero runtime overhead.
use simplex_training::dual;
// Dual number: value + ε × derivative
let x = dual::variable(3.0); // 3 + 1ε
// All operations automatically track derivatives
let f = x * x + x.sin();
println!("f(3) = {}", f.value); // 9.1411...
println!("f'(3) = {}", f.derivative); // 5.0100... (2x + cos(x), and cos(3) is negative)
This compiles to the same assembly as hand-written derivative code. No tape. No graph. Just struct operations that the optimizer inlines completely.
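For readers who want to see what "just struct operations" means, here is a minimal dual-number type in plain Rust. It is a sketch, not the Simplex `dual` module: only addition, multiplication, and `sin` are implemented, and the names mirror the snippet above purely for readability.

```rust
use std::ops::{Add, Mul};

/// A dual number: a value together with its derivative.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Dual {
    pub value: f64,
    pub derivative: f64,
}

impl Dual {
    /// An independent variable: d(x)/dx = 1.
    pub fn variable(value: f64) -> Dual {
        Dual { value, derivative: 1.0 }
    }

    /// A constant: derivative 0.
    pub fn constant(value: f64) -> Dual {
        Dual { value, derivative: 0.0 }
    }

    /// Chain rule for sine: (sin u)' = u' * cos u.
    pub fn sin(self) -> Dual {
        Dual { value: self.value.sin(), derivative: self.derivative * self.value.cos() }
    }
}

impl Add for Dual {
    type Output = Dual;
    /// Sum rule: (u + v)' = u' + v'.
    fn add(self, rhs: Dual) -> Dual {
        Dual { value: self.value + rhs.value, derivative: self.derivative + rhs.derivative }
    }
}

impl Mul for Dual {
    type Output = Dual;
    /// Product rule: (u * v)' = u' * v + u * v'.
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            value: self.value * rhs.value,
            derivative: self.derivative * rhs.value + self.value * rhs.derivative,
        }
    }
}

/// f(x) = x * x + sin(x), so f'(x) = 2x + cos(x).
pub fn f(x: Dual) -> Dual {
    x * x + x.sin()
}
```

Calling `f(Dual::variable(3.0))` returns both f(3) and f'(3) in one pass; every operation is a handful of floating-point instructions on a two-field struct, which is why a compiler can inline it away.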
Forward vs Reverse Mode: Why It Matters for Self-Annealing
Automatic differentiation comes in two flavors, and the choice between them has profound implications for self-learning systems:
| Mode | Mechanism | Complexity | Best For |
|---|---|---|---|
| Forward Mode | Dual numbers propagate derivatives alongside values | O(n) passes for n inputs | Few inputs, many outputs |
| Reverse Mode | Build computation graph, backpropagate from output | O(m) passes for m outputs | Many inputs, few outputs |
Reverse mode (backpropagation) dominates deep learning because neural networks have millions of parameters (inputs) but typically one scalar loss (output). Computing ∂Loss/∂w_i for all weights in one backward pass is efficient.
Forward mode shines in the opposite scenario: few inputs, many outputs. This is exactly the situation with self-annealing hyperparameters:
- Inputs: A handful of hyperparameters (temperature, learning rate, momentum)—typically 3-10 values
- Outputs: The entire loss trajectory—thousands of loss values across training
To compute meta-gradients with reverse mode, you'd need to store the entire computation graph across all training steps, so memory grows with the length of training. Forward mode with dual numbers computes ∂Loss_t/∂τ at every step t while storing only one extra derivative per value, with no tape or graph to retain:
// Forward mode: derivative computed alongside value
let mut tau = dual::variable(current_temperature); // tau + 1ε
let mut meta_gradient = 0.0; // running sum of ∂Loss/∂τ over the run
for step in 0..training_steps {
    let loss = train_step(model, data, tau);
    // loss.derivative is ∂Loss/∂τ - computed for free!
    meta_gradient += loss.derivative;
    // Update temperature using its own gradient
    tau = tau - meta_lr * dual::constant(loss.derivative);
}
This is why Simplex's dual number foundation was essential for Self-Learning Annealing. Reverse-mode AD would require one of:
- Storing computation graphs across thousands of steps (memory-prohibitive)
- Truncating backpropagation (losing long-range dependencies)
- Expensive checkpointing schemes (slow and complex)
Forward mode with dual numbers sidesteps all of this. The meta-gradient flows naturally alongside the computation, enabling optimization that optimizes itself without architectural compromises.
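A toy example makes the memory argument tangible. The Rust sketch below differentiates an entire training run with respect to a hyperparameter using forward mode written out by hand. Two simplifications are assumptions of this sketch: the hyperparameter is a learning rate `eta` rather than a temperature, and the objective is the quadratic L(w) = 0.5·w², chosen because the result can be checked analytically.

```rust
/// Run `steps` of gradient descent on L(w) = 0.5 * w^2 with learning rate `eta`,
/// carrying the tangent dw/d(eta) alongside w at every step.
/// Returns (final_loss, d final_loss / d eta).
pub fn final_loss_and_meta_gradient(w0: f64, eta: f64, steps: u32) -> (f64, f64) {
    let mut w = w0;
    let mut dw_deta = 0.0; // the initial weight does not depend on eta
    for _ in 0..steps {
        let grad = w; // dL/dw for L = 0.5 * w^2
        let dgrad_deta = dw_deta; // tangent of the gradient
        // Update rule: w_next = w - eta * grad.
        // Its tangent, by the product rule: dw - (grad + eta * dgrad).
        let w_next = w - eta * grad;
        let dw_next = dw_deta - (grad + eta * dgrad_deta);
        w = w_next;
        dw_deta = dw_next;
    }
    let loss = 0.5 * w * w;
    let dloss_deta = w * dw_deta; // chain rule through the final loss
    (loss, dloss_deta)
}
```

Only two scalars, `w` and `dw_deta`, persist between iterations, so memory use is the same for 50 steps as for 50 million. For this objective the closed form is w_T = (1 − η)^T · w₀, which gives an independent check on the meta-gradient.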
Migration Guide
Upgrading from v0.8.x to v0.9.0 is straightforward:
No Breaking Changes
All existing code compiles unchanged. Self-Learning Annealing and Edge Hive are additive features.
New Imports
// Self-Learning Annealing
use simplex_training::{LearnableSchedule, MetaOptimizer, SoftAcceptance};
// Edge Hive
use simplex_http::{EdgeHive, LocalSpecialist, DeviceCapability};
// New stdlib modules
use std::compress::{gzip, gunzip};
use std::sync::mpsc::{channel, unbounded};
use std::crypto::{bcrypt_hash, generate_token};
Optional Migration: Fixed to Learned
// Before: fixed schedule
let temp = initial_temp * decay_rate.powi(step);
// After: learned schedule
let schedule = LearnableSchedule::new().initial_temp(initial_temp);
let temp = schedule.temperature(step, loss, grad_norm);
What's Next: Roadmap to 1.0
With v0.9.0, Simplex has the core infrastructure for self-optimizing AI systems. The path to 1.0 focuses on performance and tooling:
- v0.10.0 - GPU Acceleration: CUDA and Metal backends for tensor operations
- v0.11.0 - Distributed Hive: Cross-node hive coordination with Raft consensus
- v0.12.0 - Developer Tooling: VS Code plugin, interactive debugger, REPL
- v1.0.0 - Production Release: API stability, comprehensive docs, certified model zoo
Try It Today
Simplex 0.9.0 is available now.
The future of AI isn't just models that learn from data. It's models that learn how to learn, running privately on the devices we already own. Simplex 0.9.0 is a step toward that future.