I build small bridges between messy knowledge and working systems.
I am a PhD student in Sogang University's AI department, advised by Prof. Jong-Rak Kim, and a technical lead at DeepFountain. My work moves between formal mathematics, AI evaluation, and deployed AI systems.
The projects look scattered at first: Lean libraries, math benchmarks, university chatbots, river forecasting, pathology images, Buddhist text retrieval. I see them as variations on one maker problem: how do we turn a domain into something searchable, verifiable, extensible, and useful?

Jae-Hyun Baek
백재현
Current roles
PhD student, Sogang AI
Formalized mathematical AI, coding theory, agentic evaluation
Technical lead, DeepFountain
Applied AI products, RAG systems, agent workflows
Research operator
Turning domain knowledge into papers, benchmarks, and deployed artifacts
The same motion, different domains.
A theorem, a PDF archive, a table, a product log, a set of screenshots: every project has some hidden knowledge shape. I try to make that shape explicit.
Many of my projects begin as rough scripts, local tools, or one-off experiments. The useful ones become libraries, dashboards, papers, or deployed systems.
If a human can understand a domain but an agent cannot search, reuse, or verify it, the knowledge is still half locked. That is why Lean, MCP, graph memory, and retrieval systems keep showing up in my work.
What I am trying to become
A researcher-builder who can turn a domain into an agent-operable research environment: searchable, verifiable, extensible, and useful enough to become either a paper or a product.
Mathematics
Coding theory, self-dual codes, combinatorial structures, and formal proof became the technical backbone of my research.
AI systems
RAG products, forecasting workflows, medical AI, and education tools pushed me to turn research ideas into systems other people can use.
Research memory
The thread tying them together is now clearer: build agent-operable knowledge environments for domains that matter.
The question I keep asking
What is the knowledge substrate here, and what would it take for an agent, prover, evaluator, or product team to use it without guessing?
How I decide what belongs in the workshop
The risk in my work is not lack of ideas; it is letting too many unrelated ideas blur the main direction. These are the filters I want future projects to pass.
Does this expose a domain substrate that agents, provers, or evaluators can reuse?
Can the work leave behind an artifact: a library, dataset, benchmark, deployed assistant, or reproducible paper trail?
Is there a real user, expert, or scientific question keeping the system honest?
Will saying yes sharpen the core thesis, or only add another disconnected obligation?
Signals that point back to the work.
I keep awards here as supporting evidence rather than the center of the site. Each item is tied to the research or product thread it helped make visible.

Best Presentation Award
ISIS 2025, International Symposium on Advanced Intelligent Systems
A Case Study on Alignment Faking in LLMs
Useful as an external signal that the portfolio includes evaluation and safety-oriented reasoning, not only product building.

Best Paper Award
2026 KIIS Spring Conference
Proposal of an LLM-Lean approach and architecture for automated mathematical problem solving
Public evidence that the IMDS Lean/AI4Math line moved from internal experiments into an official presentation track.
Best Paper Award
Korean Institute of Intelligent Systems
Water Level Forecasting using AI
Connects the applied forecasting line to peer-visible recognition while the main site keeps the research question, not the certificate, at the center.

Best Paper Award
Korean Institute of Intelligent Systems
Performance Optimization of RAG-based LLMs
An early signal for the RAG/domain-document line that later connects to institutional, legal, and source-grounded assistants.

Encouragement Prize
1st AI Commercial Festival
AI-driven advertising business prototype
A product-oriented signal; useful as background, but intentionally kept secondary to the research-program narrative.