FocusKPI is looking for an LLM or GenAI Application Engineer to join one of our clients, a high-tech SaaS company. The role is part of the client's core GenAI development team. It primarily contributes to the development, evaluation, and testing of LLM-based applications and new features, as well as core technology such as an agent framework, using the latest LLM technology stack.
Work Location: Mountain View, CA
Duration: 12-month contract; Hybrid role (4 days per week onsite)
Pay Range: $110/hr to $120/hr
**No C2C resumes will be considered**
Design, train, and fine-tune large language models (e.g., GPT, LLaMA, PaLM) for various applications.
Research cutting-edge techniques in natural language processing (NLP) and machine learning to improve model performance.
Explore advancements in transformer architectures, multi-modal models, and emergent AI behaviors.
Collect, clean, and preprocess large-scale text datasets from diverse sources.
Develop and implement data augmentation techniques to improve training data quality.
Ensure data is free from bias and aligned with ethical AI standards.
Optimize model architecture to improve accuracy, efficiency, and scalability.
Implement techniques to reduce latency, memory footprint, and inference time for real-time applications.
Collaborate with MLOps teams to deploy LLMs into production environments using Docker, Kubernetes, and cloud platforms.
Develop robust evaluation pipelines to measure model performance using key metrics like accuracy, perplexity, BLEU, and F1 score.
Continuously test for bias, fairness, and robustness of language models across diverse datasets.
Conduct A/B testing to evaluate model improvements in real-world applications.
Stay updated with the latest advancements in generative AI, transformers, and NLP research.
Contribute to research papers, patents, and open-source projects, and present findings and insights at conferences and internal knowledge-sharing sessions.
5-7 years of industry work experience, along with research/academic experience, is required.
Advanced degree in Computer Science, Artificial Intelligence, Data Science, or a related field.
Strong programming skills.
Expertise with LLM and GenAI application development.
Experience with deep learning frameworks such as TensorFlow, PyTorch, or JAX.
Hands-on experience with transformer-based models (e.g., GPT, BERT, RoBERTa, LLaMA).
Expertise in natural language processing (NLP) and sequence-to-sequence models.
Familiarity with Hugging Face libraries and OpenAI APIs.
Experience with MLOps tools like Docker, Kubernetes, and CI/CD pipelines.
Strong understanding of distributed computing and GPU acceleration using CUDA.
Knowledge of reinforcement learning and RLHF (Reinforcement Learning from Human Feedback).
Top 3 skills (must have):