Tech | 5 min read
Graph Queries in a Unified Database: From Cypher to Posting Lists
by Jaepil Jeong | March 26, 2026
Graph databases solve relationship-heavy problems elegantly, but adding a separate graph system alongside your relational database creates operational complexity. We explain how Cognica integrates graph queries into its unified algebra, enabling Cypher and SQL to compose in a single transaction without data duplication.
Research | 7 min read
Vector Scores Are Not Probabilities: Likelihood Ratio Calibration for Hybrid Search
by Jaepil Jeong | March 25, 2026
A cosine similarity of 0.85 tells you an angle, not a probability. We show how to transform vector similarity scores into calibrated relevance probabilities using distributional statistics that ANN indexes already compute — completing the probabilistic unification of text and vector retrieval.
Research | 11 min read
Why Sigmoid? The Mathematical Inevitability Behind Bayesian BM25
by Jaepil Jeong | February 23, 2026
Sigmoid is not a design choice — it is a mathematical theorem. We show why the sigmoid function is the unique valid transform for converting BM25 scores to probabilities, completing Robertson's Probability Ranking Principle after 50 years.
Tech | 15 min read
Building a Probabilistic Search Engine: Bayesian BM25 and Hybrid Search
by Jaepil Jeong | February 1, 2026
Modern search systems struggle to combine lexical matching with semantic understanding. We explore how we built a probabilistic ranking framework in Cognica Database that transforms BM25 scores into calibrated probabilities, enabling principled fusion of text and vector search results.
Tech | 18 min read
JIT Toolchain: Building a Disassembler and CPU Emulator for Database Development
by Jaepil Jeong | January 19, 2026
The essential infrastructure that makes Copy-and-Patch JIT development and debugging practical. We explore the multi-architecture disassembler used for validation and the software CPU emulator used for cross-platform testing and debugging.
Tech | 16 min read
Copy-and-Patch JIT: Achieving Native Code Performance with Microsecond Compilation
by Jaepil Jeong | January 17, 2026
How the Cognica Database Engine breaks the JIT compilation latency barrier. We explore Copy-and-Patch JIT compilation, a technique that achieves a 2-10x speedup over interpretation while keeping compilation time under one millisecond per kilobyte of bytecode.
Insights | 4 min read
An AI Database That Works Identically On-Device
by Tim Yang | December 23, 2025
We examine the database architecture changes required by on-device AI. Just as SQLite was the answer for on-device computing, on-device AI requires a new database that integrates transactions, analytics, full-text search, and vector search. We explain why Cognica works identically on-device and on servers.
Insights | 5 min read
Structural Limitations of Legal Case Search and the Need for Single DB with Vector Search
by Tim Yang | December 9, 2025
This article gives a technical analysis of why case search remains difficult in the legal services market. We examine the structural characteristics of legal case data and the limitations of existing distributed architectures (RDB + ElasticSearch + Vector DB), and explain why integrated search based on a single database is necessary.
Engineering | 12 min read
Automated Financial Statement Extraction from PDFs Using LLMs
by Cognica Team | November 18, 2025
We walk through building a system that automatically extracts and normalizes financial statements from PDFs in various formats using Large Language Models (LLMs). We cover data model design with Structured Output and Pydantic, the extraction process through the Google Gemini API, and post-processing methods applicable to real-world scenarios, all implemented in about 200 lines of code.
Insights | 3 min read
Distributed Databases: A Structural Constraint in the AI Era
by Tim Yang | November 17, 2025
We explore how function-based distributed database architectures become structural constraints in the AI era. We examine the limitations and complexity of traditional approaches combining OLTP, OLAP, FTS, and Vector DB, and introduce Cognica's unified database as a technical turning point.