Lead Data Scientist - Document Intelligence & AI Agents
Location: New York, NY (3 days/week in office)
Type: Full-Time / Direct Hire
About Our Company
We are one of the largest mutual insurance companies dedicated to inspiring well-being and helping our customers realize their dreams through comprehensive insurance and financial solutions. We serve 29 million customers by delivering transformative products and services that support their overall well-being: mind, body, and wallet.
Our multidisciplinary teams work across the enterprise to create innovative solutions for the evolving needs of our customers, their families, and the communities we serve. Through our data-driven decision-making approach and customer-centric philosophy, we develop and integrate cutting-edge AI and advanced analytics that achieve breakthrough improvements in customer experience, risk management, and operational efficiency.
About the Role
We are seeking a Lead Data Scientist to drive the innovation agenda in Document Intelligence, Intelligent Knowledge Retrieval, and AI Automation with Agentic AI and Generative AI.
In this high-impact role, you will:
- Set the multi-year vision for AI-driven document and knowledge solutions.
- Build and lead a team of data scientists delivering enterprise-scale ML/AI products.
- Design and deploy solutions in Intelligent Document Processing (IDP), Retrieval-Augmented Generation (RAG), and AI agents/bots.
- Collaborate with senior executives and business leaders to transform operations, improve efficiency, and enhance customer experience.
Key Responsibilities
Leadership & Strategy
- Define the roadmap for Document Intelligence, Knowledge Retrieval, and AI Agents, aligned with Guardian’s digital transformation.
- Act as an AI thought leader: evaluate emerging technologies, frame build-vs-buy decisions, and make investment cases.
- Define and track north-star metrics (e.g., straight-through-processing %, cycle time, cost-to-serve, accuracy, call deflection).
- Lead, mentor, and scale a high-performing data science team with a culture of experimentation and delivery.
Product & Delivery (IDP, Retrieval, AI Agents)
- Own the end-to-end lifecycle (discovery → pilot → rollout) for IDP use cases (classification, OCR/layout understanding, entity & table extraction, summarization).
- Develop retrieval-augmented solutions (RAG, grounded generation, vector search, routing) for policy, clinical, and knowledge documents.
- Build agentic automations and AI bots for intake, triage, and workflow orchestration with human-in-the-loop oversight.
- Translate ambiguous problems into measurable AI products with clear SLAs and operational runbooks.
Research & Innovation
- Evaluate advancements in Deep Learning, LLMs, multimodal document models, and agent frameworks.
- Partner with universities and industry; contribute to patentable IP and reusable components.
- Rapidly prototype with real-world data; move winning approaches to production with A/B and champion-challenger testing.
Delivery, Responsible AI & Integration
- Implement multi-layer evaluation: accuracy, retrieval quality, grounding/hallucination, safety, latency, and cost.
- Optimize compute & inference (GPU/distributed, caching, batching, routing) for performance and efficiency.
- Embed Responsible AI principles: privacy (PHI/PII), explainability, bias/fairness monitoring, safety, and governance (prompts, datasets, versioning).
- Partner with Product, Engineering, Data, and business leaders to prioritize backlog and embed solutions in workflows.
- Lead vendor evaluations and represent Guardian at executive and industry forums.
What We’re Looking For
Must-Have Qualifications
Education:
- PhD + 6+ years, OR Master’s + 8+ years, OR Bachelor’s + 10+ years in Computer Science, Engineering, Applied Math, or related field.
Experience:
- 7+ years in ML/AI solution development.
- 3+ years leading and mentoring data science teams.
Technical Expertise
- Document Intelligence (IDP): OCR, layout understanding, document classification, entity/table extraction, summarization.
- LLMs & Generative AI: prompt design/tuning (incl. PEFT/fine-tuning), function/tool calling, safety guardrails, RAG (vector search, chunking, grounding).
- AI Bots & Agentic Automation: designing and deploying AI agents for intake, triage, and workflow orchestration with human-in-the-loop.
- Programming & Frameworks: Python, PyTorch/TensorFlow; Spark or Ray; GPU/distributed computing.
- MLOps: CI/CD for models/prompts, MLflow/W&B, feature/embedding stores, monitoring/observability, A/B and champion-challenger testing.
- Evaluation: document accuracy, retrieval quality, grounding/hallucination, safety, latency/cost linked to KPIs.
- Responsible AI: privacy (PHI/PII), explainability, bias monitoring, governance, model risk documentation.
Leadership & Communication
- Proven ability to lead teams of 4+ data scientists.
- Strong executive communication and storytelling skills.
- Experience building investment cases and managing vendors.
- Collaboration with engineering, product, and business leaders.
Why Join
- Impact at scale: Lead AI solutions transforming underwriting, claims, and customer service.
- Cutting-edge innovation: Work with LLMs, multimodal models, and agent-based AI.
- Leadership visibility: Direct exposure to C-suite and strategic initiatives.
- Culture of innovation: Collaborate with researchers, engineers, and industry partners.
- Competitive package: Strong salary, comprehensive health benefits, 12% 401(k) match, 25% bonus.