Jetstream2 has recently unveiled a Large Language Model (LLM) inference service tailored to Jetstream2 users. This service provides access to advanced open-weight LLMs through two primary interfaces: a browser-based chat interface via Open WebUI similar to ChatGPT, and OpenAI-compatible inference APIs for seamless integration into various projects and applications.
The Jetstream2 inference service is especially valuable as it provides unlimited access to powerful large language models (LLMs) at no cost to researchers, educators, and students within the Jetstream2 community. Unlike LLMs running on personal machines or standard Jetstream2 instances, this service utilizes significantly larger and more capable models. Whether refining research papers, debugging code, brainstorming for projects, or summarizing complex texts, the Jetstream2 inference service can boost productivity and help facilitate innovation.
Users can explore AI-driven workflows with confidence knowing that security and privacy are key considerations for the Jetstream2 inference service. All data is processed exclusively within the IU Bloomington Data Center, ensuring that user interactions remain confidential. Prompt and response data are encrypted in transit, and the system does not store conversations or use data for AI training.
Users can engage with the broader Jetstream2 community for support and collaboration by joining the inference-service channel in the Jetstream2 community chat. This space enables discussions on best practices, troubleshooting, and sharing ideas into how these LLMs can be effectively applied across various domains.
As the state of the art advances rapidly, the models offered are subject to change. Current models available include:
- DeepSeek R1, a 671-billion-parameter chains-of-thought reasoning model
- Llama 4 Scout, our latest vision-language model
- Llama-3.3-70B-Instruct, a general-purpose instruct-tuned model
To access the inference service and connect with other users, visit the Inference Service Overview page.