---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE
---

# FLUX.1 [dev] -- Flumina Server App (FP8 Version)

This repository contains an implementation of the FLUX.1 [dev] [FP8 version](https://github.com/aredden/flux-fp8-api), which uses float8 numerics in place of bfloat16. This optimization yields up to 2x faster inference than the bfloat16 version, making it ideal for high-speed, resource-efficient applications on Fireworks AI’s Flumina Server App toolkit.

![Example output](example.png)

## Getting Started -- Serverless deployment on Fireworks

This FP8 Server App is deployed to Fireworks as-is in a "serverless" deployment, enabling you to leverage its performance boost without managing servers yourself.

Grab an [API key](https://fireworks.ai/account/api-keys) from Fireworks and set it in your environment variables:

```bash
export API_KEY=YOUR_API_KEY_HERE
```

### Text-to-Image Example Call

```bash
curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/flux-1-dev-fp8/text_to_image' \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -H "Accept: image/jpeg" \
    -d '{
        "prompt": "Woman laying in the grass",
        "aspect_ratio": "16:9",
        "guidance_scale": 3.5,
        "num_inference_steps": 30,
        "seed": 0
    }' \
    --output output.jpg
```

![Output of text-to-image](t2i_output.jpg)
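If you prefer calling the endpoint from Python, the following is a minimal sketch equivalent to the curl example above. It uses the third-party `requests` library (an assumption; any HTTP client works) and reads `API_KEY` from the environment, as set earlier:

```python
import os

import requests  # third-party: pip install requests

URL = (
    "https://api.fireworks.ai/inference/v1/workflows"
    "/accounts/fireworks/models/flux-1-dev-fp8/text_to_image"
)

# Same request as the curl example above.
response = requests.post(
    URL,
    headers={
        "Authorization": f"Bearer {os.environ['API_KEY']}",
        "Content-Type": "application/json",
        "Accept": "image/jpeg",
    },
    json={
        "prompt": "Woman laying in the grass",
        "aspect_ratio": "16:9",
        "guidance_scale": 3.5,
        "num_inference_steps": 30,
        "seed": 0,
    },
)
response.raise_for_status()

# With "Accept: image/jpeg", the response body is raw JPEG bytes.
with open("output.jpg", "wb") as f:
    f.write(response.content)
```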
## Deploying FLUX.1 [dev] to Fireworks On-Demand

FLUX.1 [dev] (bfloat16) is available on Fireworks via [on-demand deployments](https://docs.fireworks.ai/guides/ondemand-deployments). It can be deployed in a few simple steps:

### Prerequisite: Install the Flumina CLI

The Flumina CLI is included with the [fireworks-ai](https://pypi.org/project/fireworks-ai/) Python package. It can be installed with pip like so:

```bash
pip install 'fireworks-ai[flumina]>=0.15.7'
```

Also get an API key from the [Fireworks site](https://fireworks.ai/account/api-keys) and set it in the Flumina CLI:

```bash
flumina set-api-key YOURAPIKEYHERE
```

### Creating an On-Demand Deployment

`flumina deploy` can be used to create an on-demand deployment. When invoked with the name of an existing model, it creates a new deployment in your account that serves that model:

```bash
flumina deploy accounts/fireworks/models/flux-1-dev-fp8 --accelerator-type H100
```

*Note that FP8 FLUX models require `--accelerator-type H100` to deploy successfully.*

When successful, the CLI will print example commands to call your new deployment, for example:

```bash
curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/flux-1-dev-fp8/text_to_image?deployment=accounts/u-6jamesr6-63834f/deployments/a0dab4ba' \
    -H 'Authorization: Bearer API_KEY' \
    -H "Content-Type: application/json" \
    -d '{
        "prompt": "",
        "aspect_ratio": "16:9",
        "guidance_scale": 3.5,
        "num_inference_steps": 30,
        "seed": 0
    }'
```

Your deployment can also be administered using the Flumina CLI. Useful commands include:

* `flumina list deployments` to show all of your deployments
* `flumina get deployment` to get details about a specific deployment
* `flumina delete deployment` to delete a deployment

## What is Flumina?

Flumina is Fireworks AI’s new system for hosting Server Apps that lets users deploy deep learning inference to production in minutes, not weeks.

## What does Flumina offer for FLUX models?

Flumina offers the following benefits:

* Clear, precise definition of the server-side workload by inspecting the server app implementation (you are here)
* An extensibility interface that allows dynamic loading/dispatching of add-ons server-side. For FLUX:
  * ControlNet (Union) adapters
  * LoRA adapters
* Off-the-shelf support for standing up on-demand capacity for the Server App on Fireworks
* Further customization of the deployment's logic by modifying the Server App and deploying the modified version
* Now with support for FP8 numerics, delivering enhanced speed and efficiency for intensive workloads

## Deploying Custom FLUX.1 [dev] FP8 Apps to Fireworks On-demand

Coming soon!