EchoGraph

EchoGraph is a mono-repository that orchestrates the ingestion, enrichment, matching, and human validation of regulatory documents against a corpus of cloud provider guidelines. It provides ready-to-run pipelines, APIs, and user interfaces to power governance and compliance mapping workflows.

Repository Layout

/ingestion      # Python ingestion workers and n8n workflow definitions
/processing     # Text cleanup, chunking, embeddings, and relationship discovery
/api            # FastAPI backend that exposes sections, matches, and metadata
/frontend       # React single-page application for reviewers and knowledge workers
/infra          # Docker, Kubernetes, and CI/CD automation
/docs           # Architecture, playbooks, and tutorials
/tests          # Unit and integration tests
/data           # Storage location for raw, cleaned, and demo datasets

Quick Start

Prerequisites

Docker and Docker Compose for local orchestration
Python 3.10+
Node.js 18+
Poetry (optional) for Python dependency management
pnpm or npm/yarn for frontend dependency management

Bootstrap the Environment

make bootstrap

The bootstrap script will:

Create Python virtual environments for ingestion and processing workers
Install FastAPI backend dependencies
Install frontend dependencies with pnpm
Download demo documents into data/demo_docs

Run the Stack Locally

docker compose up --build

Services provided:

ingestion-worker: Executes scheduled or on-demand document ingestion jobs
processing-worker: Cleans, chunks, and embeds documents, and writes vectors to Qdrant
postgres: Stores canonical document sections, matches, and reviewer annotations
qdrant: Holds document embeddings for similarity search
api: FastAPI server serving guideline data and matches
frontend: React app for exploring guidelines and validating matches

The stack ships with a bundled Caddy reverse proxy. When running locally, you can access the reviewer UI at https://localhost (after trusting the autogenerated certificate) and the API at https://localhost/api. On remote hosts, Caddy listens on ports 80 and 443 and forwards requests to the internal frontend container. Browse via https://<vm-ip> (recommended), or via http://<vm-ip> for a quick check before trusting the generated certificate. Visiting http://<vm-ip>:5173 directly does not work, because that port is bound only to the loopback interface inside the VM.

Reviewer Experience Highlights

The reviewer UI supports end-to-end analysis without leaving the browser:

Upload documents directly – Drop new cloud guidelines or regulatory frameworks, and the backend will extract, segment, embed, and generate candidate matches automatically.
Interactive footnotes – Selecting a guideline reveals inline highlights that act like live footnotes. Hovering or clicking a highlight surfaces the linked regulation text, similarity rationale, and confidence estimates.
Context-rich inspection – The match panel summarizes rationale, excerpts, and metadata so IT teams can quickly judge whether internal guidance aligns with external obligations.
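
As a rough illustration of the information a highlight or match panel surfaces, the sketch below models one candidate match and renders a short footnote string from it. The class name CandidateMatch, its fields, and the footnote_text helper are hypothetical and chosen for this example only; the actual API schema and frontend rendering may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateMatch:
    """Hypothetical shape of one guideline-to-regulation match."""
    guideline_excerpt: str
    regulation_excerpt: str
    rationale: str
    confidence: float  # assumed to be a score between 0 and 1
    decision: Optional[str] = None  # e.g. "validated" or "rejected" after review


def footnote_text(match: CandidateMatch, max_chars: int = 80) -> str:
    """Build a compact footnote: regulation text, rationale, confidence."""
    excerpt = match.regulation_excerpt
    if len(excerpt) > max_chars:
        # Truncate long regulation text so the footnote stays readable.
        excerpt = excerpt[:max_chars].rstrip() + "..."
    return f"{excerpt} | {match.rationale} | confidence {match.confidence:.0%}"


example = CandidateMatch(
    guideline_excerpt="Encrypt data at rest",
    regulation_excerpt="Personal data must be protected",
    rationale="Both address data protection",
    confidence=0.82,
)
print(footnote_text(example))
```

Keeping the reviewer decision on the same record as the rationale and confidence is one possible design; a separate annotations table would work equally well.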

Data Lifecycle

Ingestion: Python workers, orchestrated by n8n, download documents, extract text using pdfplumber, python-docx, or Apache Tika, and write raw JSONL files to data/raw.
Processing: Cleanup and chunking pipelines normalize text and create embeddings using Sentence Transformers. Cleaned chunks and metadata are written to data/processed and mirrored into Qdrant or pgvector.
Relationship Discovery: Matching jobs look up related regulation sections for each cloud guideline chunk, summarize the rationale with an LLM, and produce candidate matches.
Human Validation: Reviewers validate or reject matches in the frontend UI; their decisions are persisted in PostgreSQL.
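
The processing and relationship discovery stages above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names, the word-window chunking strategy, and the default sizes are hypothetical, and the tiny hand-written vectors stand in for the embeddings that Sentence Transformers would produce and that Qdrant would search.

```python
import math
import re
from typing import Dict, List, Tuple


def clean_text(raw: str) -> str:
    """Collapse whitespace runs; a stand-in for the real cleanup step."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(
    query: List[float],
    candidates: Dict[str, List[float]],
    k: int = 3,
    threshold: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return up to k candidate ids with similarity >= threshold, best first."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in candidates.items()]
    scored = [(cid, score) for cid, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors in place of real embeddings; the section ids are made up.
regulation_vectors = {"reg-1": [1.0, 0.0], "reg-2": [0.0, 1.0], "reg-3": [1.0, 1.0]}
print(chunk_words(clean_text("a b  c d\ne f g"), size=3, overlap=1))
print(top_matches([1.0, 0.0], regulation_vectors, k=2))
```

In a full pipeline, each candidate returned by a function like top_matches would then be passed to the LLM summarization step and queued for human validation.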

Documentation

Contributing

Please read CONTRIBUTING.md for coding standards, pull request etiquette, and how to participate in the community.

License

EchoGraph is released under the GPL-3.0 license. See LICENSE for details.

MarcusGraetsch/EchoGraph
