TraceSpike
TraceSpike is a web app (with optional Slack/Teams integrations) for anomaly detection on distributed traces and service dependencies, not just metrics. It ingests OpenTelemetry traces, builds a live service graph, and flags abnormal changes in latency, error rate, and call patterns on specific service-to-service edges. Instead of generic alerts, it surfaces “what changed” narratives: the exact endpoint, the upstream and downstream suspects, and the time window where behavior diverged. The MVP targets teams that already collect traces but are drowning in noisy alerts and slow triage. It’s an AI-assisted app: lightweight ML models the baseline and seasonality, and an LLM layer summarizes anomalies into actionable incident notes with suggested next checks (logs to search, recent deploys to compare). Realistically, it won’t replace full observability suites; it wins by making trace anomalies explainable and fast to act on.
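
To make the per-edge baseline idea concrete, here is a minimal Python sketch of how latency anomalies on a single service-to-service edge might be flagged. It is an illustration under stated assumptions, not TraceSpike's actual method: the description above does not name an algorithm, so this uses a rolling median with a median-absolute-deviation (MAD) score as a stand-in. The names EdgeBaseline, EdgeAnomaly, observe, and all parameter defaults are hypothetical, and the sketch ignores seasonality, error rates, and call-pattern changes.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from statistics import median
from typing import Deque, Dict, Optional, Tuple

Edge = Tuple[str, str]  # (caller service, callee service)


@dataclass
class EdgeAnomaly:
    """Hypothetical record of one flagged observation on an edge."""
    edge: Edge
    observed_ms: float
    baseline_ms: float
    score: float


class EdgeBaseline:
    """Rolling per-edge latency baseline (assumed design, not from the source)."""

    def __init__(self, window: int = 50, min_samples: int = 10,
                 threshold: float = 5.0, min_scale_ms: float = 1.0):
        self.min_samples = min_samples
        self.threshold = threshold
        self.min_scale_ms = min_scale_ms
        # One bounded history of recent latencies per (caller, callee) edge.
        self._history: Dict[Edge, Deque[float]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def observe(self, caller: str, callee: str,
                latency_ms: float) -> Optional[EdgeAnomaly]:
        """Score one span latency; return an EdgeAnomaly if it is far above baseline."""
        edge = (caller, callee)
        history = self._history[edge]
        anomaly = None
        if len(history) >= self.min_samples:
            baseline = median(history)
            mad = median([abs(x - baseline) for x in history])
            # 1.4826 * MAD approximates a standard deviation for normal data;
            # the floor avoids huge scores when the history is nearly constant.
            scale = max(1.4826 * mad, self.min_scale_ms)
            score = (latency_ms - baseline) / scale
            if score > self.threshold:
                anomaly = EdgeAnomaly(edge, latency_ms, baseline, score)
        # Only normal points update the baseline, so a spike does not skew it.
        if anomaly is None:
            history.append(latency_ms)
        return anomaly


if __name__ == "__main__":
    detector = EdgeBaseline()
    # Warm up the checkout -> payments edge with latencies near 100 ms.
    for i in range(20):
        detector.observe("checkout", "payments", 100.0 + (i % 5))
    # A sudden jump on the same edge should be flagged.
    print(detector.observe("checkout", "payments", 400.0))
```

Keying the history by (caller, callee) rather than by service is what lets a flag point at a specific edge of the graph. A fuller version would likely keep separate baselines per time-of-day or day-of-week bucket to handle seasonality, and would pass flagged EdgeAnomaly records to the summarization layer.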