The proposed project aims to develop an AI platform powered by a domain-specific Large Language Model (LLM) tailored to the Turkish language. The platform will enhance information extraction, semantic analysis, and decision-making processes in data-intensive sectors, particularly the legal, financial, and administrative domains. Building on prior experience from TÜBİTAK-supported initiatives, including semantic search, document classification, and legal reasoning tools, this project goes a step further by integrating deep language understanding with scalable automation.
A key innovation is a Turkish LLM, intended to be the most advanced to date, which will be trained to handle the agglutinative morphology, syntax, and domain-specific terminology of Turkish. These are areas where existing international models such as GPT, LLaMA, and Qwen underperform. The Turkish LLM will serve as the foundation of Mecellem, an intelligent knowledge processing platform developed by New Mind. Mecellem will allow users to upload documents, extract structured knowledge, and interact with AI through a natural language interface.
The platform will also offer a "Model Page", enabling users to train, fine-tune, or select from over 200 pre-trained models developed by New Mind. These models will be embedded in an automated pipeline that can:
Preprocess and analyze documents,
Generate and enrich knowledge graphs (KGs),
Conduct semantic reasoning,
Answer user queries through chat-based AI interaction,
Adapt dynamically to different sectors via model customization.
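The staged pipeline described above can be pictured with a minimal sketch. It assumes that each stage is a plain function over a shared state dictionary and that sector customization amounts to choosing a different list of stages. All names here (preprocess, tag_legal_terms, SECTOR_PIPELINES, run) are hypothetical illustrations, not part of Mecellem.

```python
from typing import Callable, Dict, List

# A stage receives the shared state and returns it, possibly with new keys.
Stage = Callable[[dict], dict]

def preprocess(state: dict) -> dict:
    # Placeholder preprocessing: collapse runs of whitespace.
    state["clean_text"] = " ".join(state["raw_text"].split())
    return state

def split_sentences(state: dict) -> dict:
    # Naive split on periods; a real system would use a Turkish-aware tokenizer.
    parts = state["clean_text"].split(".")
    state["sentences"] = [p.strip() for p in parts if p.strip()]
    return state

def tag_legal_terms(state: dict) -> dict:
    # Hypothetical sector-specific stage: flag a tiny, made-up term list.
    terms = {"sözleşme", "madde"}
    words = (w.strip(".,") for w in state["clean_text"].split())
    state["terms"] = sorted({w for w in words if w in terms})
    return state

# Sector customization modeled as a lookup from sector name to stage list.
SECTOR_PIPELINES: Dict[str, List[Stage]] = {
    "generic": [preprocess, split_sentences],
    "legal": [preprocess, split_sentences, tag_legal_terms],
}

def run(raw_text: str, sector: str = "generic") -> dict:
    state = {"raw_text": raw_text}
    for stage in SECTOR_PIPELINES[sector]:
        state = stage(state)
    return state
```

Calling run(text, "legal") applies the extra legal stage, while run(text) uses only the generic stages. In the platform itself, each stage would presumably wrap a trained model rather than a string rule.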
A core component of the project is the automated generation of synthetic data to address the scarcity of high-quality, domain-specific Turkish corpora. Using advanced NLP techniques (including text generation, paraphrasing, rule-based generation, and simulation-based modeling), the system will produce GDPR- and KVKK-compliant datasets that enhance model performance without compromising data privacy.
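Of the techniques listed, rule-based generation is the simplest to illustrate. The sketch below assumes a template-filling approach in which every slot value is invented, so no real personal data enters the output. The templates, slot values, and function names are hypothetical and are not taken from the project.

```python
import random
import string

# Invented templates and slot values; none refer to real parties or documents.
TEMPLATES = [
    "{party_a} ile {party_b} arasında {date} tarihinde {doc_type} imzalanmıştır.",
    "Söz konusu {doc_type}, {date} tarihinde {party_a} tarafından feshedilmiştir.",
]
SLOTS = {
    "party_a": ["Alfa A.Ş.", "Beta Ltd. Şti."],
    "party_b": ["Gama Holding", "Delta Kooperatifi"],
    "date": ["01.03.2023", "15.09.2024"],
    "doc_type": ["kira sözleşmesi", "hizmet sözleşmesi"],
}

def slots_in(template: str) -> list:
    # Extract the placeholder names that a template actually uses.
    return [name for _, name, _, _ in string.Formatter().parse(template) if name]

def generate(n: int, seed: int = 0) -> list:
    # A seeded generator keeps the synthetic dataset reproducible.
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        template = rng.choice(TEMPLATES)
        values = {slot: rng.choice(SLOTS[slot]) for slot in slots_in(template)}
        # The filled slots double as labels, e.g. for entity extraction training.
        samples.append({"text": template.format(**values), "labels": values})
    return samples
```

Because each sample carries its own slot values as labels, the generated text is annotated for free. Paraphrasing or LLM-based generation could then be layered on top to increase variety.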
The project also emphasizes automated knowledge graph construction, powered by AI models that can detect, predict, and visualize semantic relationships across diverse document types. This will transform unstructured data into contextualized, interconnected knowledge networks, supporting strategic insights and accelerating decision-making.
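To make the idea of an interconnected knowledge network concrete, the following sketch stores extracted relationships as subject-relation-object triples and answers simple reachability questions. It assumes that an upstream model has already produced the triples. The KnowledgeGraph class and the sample entities are hypothetical placeholders.

```python
from collections import defaultdict, deque

class KnowledgeGraph:
    """Directed graph of (subject, relation, object) triples."""

    def __init__(self):
        # Maps each subject to the set of (relation, object) pairs leaving it.
        self._edges = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._edges[subject].add((relation, obj))

    def neighbors(self, subject: str, relation: str = None) -> list:
        # Objects linked from subject, optionally filtered by relation name.
        pairs = self._edges.get(subject, ())
        return sorted(o for r, o in pairs if relation is None or r == relation)

    def path_exists(self, start: str, goal: str) -> bool:
        # Breadth-first search along edge direction.
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                return True
            for _, nxt in self._edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False

# Placeholder triples, as an extraction model might emit them.
kg = KnowledgeGraph()
kg.add("Sözleşme-17", "taraf", "Alfa A.Ş.")
kg.add("Sözleşme-17", "dayanak", "Kanun-X m.1")
kg.add("Kanun-X m.1", "parçası", "Kanun-X")
```

Here kg.path_exists("Sözleşme-17", "Kanun-X") follows two edges to link a contract to the statute it relies on, which is the kind of cross-document connection the knowledge graph is meant to surface.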
From a technical perspective, the system integrates modern AI stack components (e.g., Apache Airflow, MLflow, and scalable model training infrastructure) to automate the entire data-to-insight pipeline. Users without programming or AI expertise will be able to configure, monitor, and evaluate their models within an intuitive interface, bringing AI capabilities to a much broader user base.
Strategic benefits include:
Democratization of AI across SMEs and large enterprises in Türkiye,
Acceleration of digital transformation via affordable, scalable, and sector-specific AI tools,
Reinforcement of national AI capacity aligned with Türkiye’s 12th Development Plan and the National AI Strategy (2021–2025),
Reduction of foreign dependency by developing sovereign AI technologies.
In conclusion, this project aims not only to produce the most capable Turkish LLM to date but also to deliver a robust, AI-powered ecosystem where users can create dynamic knowledge networks, enhance legal and business operations, and interact naturally with AI. It represents a significant step in Türkiye’s journey toward becoming a global leader in AI innovation and data-driven governance.
Muhammed Sadullah Güzel, New Mind AI, Türkiye