Generative AI on Kubernetes: Operationalizing Large Language Models

Generative AI on Kubernetes: Operationalizing Large Language Models - Paperback

$59.99

Availability: In Stock
Contributor: Roland Huß, Daniele Zonca
Publish date: 4/7/2026
Pages: 404
Language: English
Publisher: O'Reilly Media
ISBN-13: 9781098171926
ISBN-10: 1098171926
UPC: 9781098171926
Book Category: Computers
Book Subcategory: Artificial Intelligence, Data Science
Book Topic: Generative AI, Natural Language Processing, Neural Networks
Size: 9.19 x 7.00 x 0.83 inches
Weight: 1.422
Product ID: SCXS8AJZWK

Generative AI is revolutionizing industries, and Kubernetes has fast become the backbone for deploying and managing these resource-intensive workloads. This book serves as a practical, hands-on guide for MLOps engineers, software developers, Kubernetes administrators, and AI professionals ready to combine AI innovation with the power of cloud native infrastructure. Authors Roland Huß and Daniele Zonca provide a clear road map for training, fine-tuning, deploying, and scaling GenAI models on Kubernetes, addressing challenges like resource optimization, automation, and security along the way.

With actionable insights and real-world examples, readers will learn to tackle the opportunities and complexities of managing GenAI applications in production environments. Whether you're experimenting with large-scale language models or facing the nuances of AI deployment at scale, you'll uncover the expertise you need to operationalize this exciting technology effectively.

  • Learn how to deploy LLMs more efficiently with optimized inference runtimes
  • Get hands-on with GPU scheduling, including hardware detection and multinode scaling
  • Monitor and understand LLM-specific metrics like Time to First Token and token throughput
  • Know when to fine-tune a model or when retrieval augmentation is the better choice
  • Discover how to evaluate models with standardized benchmarks before committing GPU resources
  • Learn to run agentic applications with secure tool integration, identity management, and persistent state
Huß, Roland: Dr. Roland Huss is a seasoned software engineer with over 25 years of experience in the field. Currently working at Red Hat, he is the architect of OpenShift Serverless and a former member of the Knative TOC. Roland is a passionate Java and Golang coder and a sought-after speaker at tech conferences. An advocate of open source, he is an active contributor and enjoys growing chili peppers in his free time.

Zonca, Daniele: Daniele Zonca is a Senior Principal Software Engineer and Architect for model serving of Red Hat OpenShift AI, Red Hat's flagship AI product combining multiple stacks.

Free shipping on orders over $75. Standard shipping takes 3-7 business days. Returns accepted within 30 days of purchase.
