Manage embeddings and vector databases for contextual retrieval and knowledge grounding.
Translate backend intelligence into usable insights for health workers, dashboards, chatbots, and community feedback loops.
Collaborate with program teams to ensure AI models reflect real public health needs, ethics, and local contexts, and are validated against field realities, including low-connectivity environments.
Support rapid data visualization for program dashboards and government review systems.
Establish data security, versioning, and model monitoring best practices.
Skills and Experience
Education: B.Tech/M.Tech in Computer Science, Data Science, or related discipline.
Experience: 3–6 years in backend, data engineering, or AI-driven product development; exposure to health, GovTech, or social impact data preferred.
Databases: PostgreSQL, BigQuery, SQLite; vector DBs such as Pinecone, FAISS, or Chroma.
AI/LLM: Gemini API, LangChain, prompt design and orchestration.
Speech Tech: Experience with ASR (Whisper, Google Speech) and TTS (Coqui, ElevenLabs). Experience applying NLP to unstructured text/audio for community feedback and AI-enabled sensemaking.
Cloud: Google Cloud Platform (Cloud Run, Cloud Functions, BigQuery, Secret Manager); experience with containerized workflows using Docker.
Data Pipelines: End-to-end ETL development, schema design, data validation, logging, and performance monitoring.
Visualization: Experience with tools such as Streamlit, Gradio, Looker Studio, and Power BI, as well as user journey mapping, for rapid analytics and insight generation.
Personal Attributes
A curious, iterative builder with strong attention to data quality and integrity.
A believer in AI for inclusion and public good.
Comfortable explaining complex systems to non-technical stakeholders.
Collaborative, detail-oriented, and driven by problem-solving.